===================================
Documentation for /proc/sys/kernel/
===================================

.. See scripts/check-sysctl-docs to keep this up to date


Copyright (c) 1998, 1999, Rik van Riel <riel@nl.linux.org>

Copyright (c) 2009, Shen Feng <shen@cn.fujitsu.com>

For general info and legal blurb, please look in :doc:`index`.

------------------------------------------------------------------------------

This file contains documentation for the sysctl files in
``/proc/sys/kernel/`` and is valid for Linux kernel version 2.2.

The files in this directory can be used to tune and monitor
miscellaneous and general things in the operation of the Linux
kernel. Since some of the files *can* be used to screw up your
system, it is advisable to read both documentation and source
before actually making adjustments.

Currently, these files might (depending on your configuration)
show up in ``/proc/sys/kernel``:

.. contents:: :local:


acct
====

::

    highwater lowwater frequency

If BSD-style process accounting is enabled, these values control
its behaviour. If free space on the filesystem where the log lives
goes below ``lowwater``\ %, accounting suspends. If free space gets
above ``highwater``\ %, accounting resumes. ``frequency`` determines
how often the amount of free space is checked (value is in
seconds). Default:

::

    4 2 30

That is, suspend accounting if free space drops below 2%; resume it
if it increases to at least 4%; consider information about the amount
of free space valid for 30 seconds.
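
The suspend/resume rule can be sketched as a small shell function. This is
only an illustration: ``acct_action`` is a hypothetical helper, not a kernel
or userspace interface, and its boundary handling (strictly below
``lowwater`` to suspend, at least ``highwater`` to resume) is an assumption
taken from the wording of the example above.

```shell
# Hypothetical helper, not part of any real tool.
# Usage: acct_action FREE_PCT HIGHWATER LOWWATER STATE
#   STATE is "active" or "suspended"; prints suspend, resume or keep.
acct_action() {
    acct_free=$1
    acct_high=$2
    acct_low=$3
    acct_state=$4
    if [ "$acct_state" = active ] && [ "$acct_free" -lt "$acct_low" ]; then
        # Free space fell below lowwater while accounting was running.
        echo suspend
    elif [ "$acct_state" = suspended ] && [ "$acct_free" -ge "$acct_high" ]; then
        # Free space recovered to highwater while accounting was suspended.
        echo resume
    else
        echo keep
    fi
}
```

With the default ``4 2 30``, ``acct_action 1 4 2 active`` prints ``suspend``,
and accounting stays suspended at 3% free space because the resume threshold
is higher than the suspend threshold.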


acpi_video_flags
================

See :doc:`/power/video`. This allows the video resume mode to be set,
in a similar fashion to the ``acpi_sleep`` kernel parameter, by
combining the following values:

= =======
1 s3_bios
2 s3_mode
4 s3_beep
= =======
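
Since the values are bit flags, combining them means OR-ing them together.
A minimal sketch follows; ``acpi_video_flags_value`` is a hypothetical
helper name used only for illustration.

```shell
# Hypothetical helper: OR together the named resume-mode flags from the
# table above and print the resulting integer.
acpi_video_flags_value() {
    avf_value=0
    for avf_name in "$@"; do
        case "$avf_name" in
            s3_bios) avf_value=$((avf_value | 1)) ;;
            s3_mode) avf_value=$((avf_value | 2)) ;;
            s3_beep) avf_value=$((avf_value | 4)) ;;
            *) echo "unknown flag: $avf_name" >&2; return 1 ;;
        esac
    done
    echo "$avf_value"
}
```

For example, ``acpi_video_flags_value s3_bios s3_mode`` prints ``3``, which
root could then write to ``/proc/sys/kernel/acpi_video_flags``.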


auto_msgmni
===========

This variable has no effect and may be removed in future kernel
releases. Reading it always returns 0.
Up to Linux 3.17, it enabled/disabled automatic recomputing of
`msgmni`_
upon memory add/remove or upon IPC namespace creation/removal.
Echoing "1" into this file enabled msgmni automatic recomputing.
Echoing "0" turned it off. The default value was 1.


bootloader_type (x86 only)
==========================

This gives the bootloader type number as indicated by the bootloader,
shifted left by 4, and OR'd with the low four bits of the bootloader
version. The reason for this encoding is that this used to match the
``type_of_loader`` field in the kernel header; the encoding is kept for
backwards compatibility. That is, if the full bootloader type number
is 0x15 and the full version number is 0x234, this file will contain
the value 340 = 0x154.

See the ``type_of_loader`` and ``ext_loader_type`` fields in
:doc:`/x86/boot` for additional information.
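
The arithmetic of the example can be checked with shell arithmetic. The
function below is a hypothetical illustration of the encoding as described
in this section, not code taken from the kernel.

```shell
# Hypothetical helper: (type << 4) | (low four bits of version).
# Usage: bootloader_type_value TYPE VERSION   (hex constants are accepted)
bootloader_type_value() {
    echo $(( ($1 << 4) | ($2 & 0xf) ))
}
```

``bootloader_type_value 0x15 0x234`` prints ``340``: 0x15 shifted left by 4
is 0x150, and the low four bits of 0x234 are 0x4.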


bootloader_version (x86 only)
=============================

The complete bootloader version number. In the example above, this
file will contain the value 564 = 0x234.

See the ``type_of_loader`` and ``ext_loader_ver`` fields in
:doc:`/x86/boot` for additional information.


bpf_stats_enabled
=================

Controls whether the kernel should collect statistics on BPF programs
(total time spent running, number of times run...). Enabling
statistics causes a slight reduction in performance on each program
run. The statistics can be seen using ``bpftool``.

= ===================================
0 Don't collect statistics (default).
1 Collect statistics.
= ===================================


cad_pid
=======

This is the pid which will be signalled on reboot (notably, by
Ctrl-Alt-Delete). Writing a value to this file which doesn't
correspond to a running process will result in ``-ESRCH``.

See also `ctrl-alt-del`_.


cap_last_cap
============

Highest valid capability of the running kernel. Exports
``CAP_LAST_CAP`` from the kernel.


core_pattern
============

``core_pattern`` is used to specify a core dumpfile pattern name.

* max length 127 characters; default value is "core"
* ``core_pattern`` is used as a pattern template for the output
  filename; certain string patterns (beginning with '%') are
  substituted with their actual values.
* backward compatibility with ``core_uses_pid``:

  If ``core_pattern`` does not include "%p" (default does not)
  and ``core_uses_pid`` is set, then .PID will be appended to
  the filename.

* corename format specifiers

  ========  ==========================================
  %<NUL>    '%' is dropped
  %%        output one '%'
  %p        pid
  %P        global pid (init PID namespace)
  %i        tid
  %I        global tid (init PID namespace)
  %u        uid (in initial user namespace)
  %g        gid (in initial user namespace)
  %d        dump mode, matches ``PR_SET_DUMPABLE`` and
            ``/proc/sys/fs/suid_dumpable``
  %s        signal number
  %t        UNIX time of dump
  %h        hostname
  %e        executable filename (may be shortened, could be changed by prctl etc)
  %f        executable filename
  %E        executable path
  %c        maximum size of core file by resource limit RLIMIT_CORE
  %<OTHER>  both are dropped
  ========  ==========================================

* If the first character of the pattern is a '|', the kernel will treat
  the rest of the pattern as a command to run. The core dump will be
  written to the standard input of that program instead of to a file.


core_pipe_limit
===============

This sysctl is only applicable when `core_pattern`_ is configured to
pipe core files to a user space helper (when the first character of
``core_pattern`` is a '|', see above).
When collecting cores via a pipe to an application, it is occasionally
useful for the collecting application to gather data about the
crashing process from its ``/proc/pid`` directory.
In order to do this safely, the kernel must wait for the collecting
process to exit, so as not to remove the crashing process's proc files
prematurely.
This in turn creates the possibility that a misbehaving userspace
collecting process can block the reaping of a crashed process simply
by never exiting.
This sysctl defends against that.
It defines how many concurrent crashing processes may be piped to user
space applications in parallel.
If this value is exceeded, then those crashing processes above that
value are noted via the kernel log and their cores are skipped.
0 is a special value, indicating that unlimited processes may be
captured in parallel, but that no waiting will take place (i.e. the
collecting process is not guaranteed access to
``/proc/<crashing pid>/``).
This value defaults to 0.
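
The limit check can be modelled as follows. This is a rough sketch of the
rule as worded above, with a hypothetical function name; it does not
reproduce the kernel's actual bookkeeping.

```shell
# Hypothetical model of the decision described above.
# Usage: core_pipe_decision LIMIT IN_FLIGHT
#   LIMIT     value of core_pipe_limit
#   IN_FLIGHT number of crashing processes being piped, including this one
core_pipe_decision() {
    cpl_limit=$1
    cpl_in_flight=$2
    if [ "$cpl_limit" -ne 0 ] && [ "$cpl_in_flight" -gt "$cpl_limit" ]; then
        # Over the limit: the core is skipped (and noted in the kernel log).
        echo skip
    else
        # Within the limit, or limit 0 meaning "unlimited, no waiting".
        echo dump
    fi
}
```

With a limit of 4, the fourth concurrent crash is still piped
(``core_pipe_decision 4 4`` prints ``dump``) while a fifth is skipped.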


core_uses_pid
=============

The default coredump filename is "core". By setting
``core_uses_pid`` to 1, the coredump filename becomes core.PID.
If `core_pattern`_ does not include "%p" (default does not)
and ``core_uses_pid`` is set, then .PID will be appended to
the filename.
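
The interaction between the two sysctls can be sketched in shell. The
function name is hypothetical, and the sketch models only the ".PID" suffix
rule: it does not expand any format specifier and does not treat patterns
that start with '|'.

```shell
# Hypothetical helper modelling only the ".PID" suffix rule.
# Usage: core_file_name PATTERN CORE_USES_PID PID
core_file_name() {
    cfn_pattern=$1
    cfn_uses_pid=$2
    cfn_pid=$3
    case "$cfn_pattern" in
        *%p*)
            # The pattern already asks for the pid: nothing is appended.
            printf '%s\n' "$cfn_pattern"
            ;;
        *)
            if [ "$cfn_uses_pid" -ne 0 ]; then
                printf '%s.%s\n' "$cfn_pattern" "$cfn_pid"
            else
                printf '%s\n' "$cfn_pattern"
            fi
            ;;
    esac
}
```

``core_file_name core 1 1234`` prints ``core.1234``, whereas a pattern such
as ``core.%p`` is returned unchanged because it already contains "%p".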


ctrl-alt-del
============

When the value in this file is 0, ctrl-alt-del is trapped and
sent to the ``init(1)`` program to handle a graceful restart.
When, however, the value is > 0, Linux's reaction to a Vulcan
Nerve Pinch (tm) will be an immediate reboot, without even
syncing its dirty buffers.

Note:
  when a program (like dosemu) has the keyboard in 'raw'
  mode, the ctrl-alt-del is intercepted by the program before it
  ever reaches the kernel tty layer, and it's up to the program
  to decide what to do with it.


dmesg_restrict
==============

This toggle indicates whether unprivileged users are prevented
from using ``dmesg(8)`` to view messages from the kernel's log
buffer.
When ``dmesg_restrict`` is set to 0 there are no restrictions.
When ``dmesg_restrict`` is set to 1, users must have
``CAP_SYSLOG`` to use ``dmesg(8)``.

The kernel config option ``CONFIG_SECURITY_DMESG_RESTRICT`` sets the
default value of ``dmesg_restrict``.


domainname & hostname
=====================

These files can be used to set the NIS/YP domainname and the
hostname of your box in exactly the same way as the commands
domainname and hostname, i.e.::

    # echo "darkstar" > /proc/sys/kernel/hostname
    # echo "mydomain" > /proc/sys/kernel/domainname

has the same effect as::

    # hostname "darkstar"
    # domainname "mydomain"

Note, however, that the classic darkstar.frop.org has the
hostname "darkstar" and DNS (Domain Name System)
domainname "frop.org", not to be confused with the NIS (Network
Information Service) or YP (Yellow Pages) domainname. These two
domain names are in general different. For a detailed discussion
see the ``hostname(1)`` man page.


firmware_config
===============

See :doc:`/driver-api/firmware/fallback-mechanisms`.

The entries in this directory allow the firmware loader helper
fallback to be controlled:

* ``force_sysfs_fallback``, when set to 1, forces the use of the
  fallback;
* ``ignore_sysfs_fallback``, when set to 1, ignores any fallback.


ftrace_dump_on_oops
===================

Determines whether ``ftrace_dump()`` should be called on an oops (or
kernel panic). This will output the contents of the ftrace buffers to
the console. This is very useful for capturing traces that lead to
crashes and outputting them to a serial console.

= ===================================================
0 Disabled (default).
1 Dump buffers of all CPUs.
2 Dump the buffer of the CPU that triggered the oops.
= ===================================================


ftrace_enabled, stack_tracer_enabled
====================================

See :doc:`/trace/ftrace`.


hardlockup_all_cpu_backtrace
============================

This value controls whether the hard lockup detector gathers further
debug information when a hard lockup condition is detected. If
enabled, arch-specific all-CPU stack dumping will be initiated.

= ============================================
0 Do nothing. This is the default behavior.
1 On detection capture more debug information.
= ============================================


hardlockup_panic
================

This parameter can be used to control whether the kernel panics
when a hard lockup is detected.

= ===========================
0 Don't panic on hard lockup.
1 Panic on hard lockup.
= ===========================

See :doc:`/admin-guide/lockup-watchdogs` for more information.
This can also be set using the ``nmi_watchdog`` kernel parameter.


hotplug
=======

Path for the hotplug policy agent.
Default value is "``/sbin/hotplug``".


hung_task_all_cpu_backtrace
===========================

If this option is set, the kernel will send an NMI to all CPUs to dump
their backtraces when a hung task is detected. This file shows up if
``CONFIG_DETECT_HUNG_TASK`` and ``CONFIG_SMP`` are enabled.

0: Won't show all CPUs backtraces when a hung task is detected.
This is the default behavior.

1: Will non-maskably interrupt all CPUs and dump their backtraces when
a hung task is detected.


hung_task_panic
===============

Controls the kernel's behavior when a hung task is detected.
This file shows up if ``CONFIG_DETECT_HUNG_TASK`` is enabled.

= =================================================
0 Continue operation. This is the default behavior.
1 Panic immediately.
= =================================================


hung_task_check_count
=====================

The upper bound on the number of tasks that are checked.
This file shows up if ``CONFIG_DETECT_HUNG_TASK`` is enabled.


hung_task_timeout_secs
======================

When a task in D state has not been scheduled
for more than this many seconds, a warning is reported.
This file shows up if ``CONFIG_DETECT_HUNG_TASK`` is enabled.

0 means infinite timeout, no checking is done.

Possible values to set are in range {0:``LONG_MAX``/``HZ``}.


hung_task_check_interval_secs
=============================

Hung task check interval. If hung task checking is enabled
(see `hung_task_timeout_secs`_), the check is done every
``hung_task_check_interval_secs`` seconds.
This file shows up if ``CONFIG_DETECT_HUNG_TASK`` is enabled.

0 (default) means use ``hung_task_timeout_secs`` as checking
interval.

Possible values to set are in range {0:``LONG_MAX``/``HZ``}.
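
How the two settings combine can be sketched as below. The function is a
hypothetical illustration that models only the rules stated in this
document; any further adjustment the kernel may apply to the interval is
not represented.

```shell
# Hypothetical helper modelling only the documented rules.
# Usage: hung_task_effective_interval TIMEOUT_SECS CHECK_INTERVAL_SECS
hung_task_effective_interval() {
    hti_timeout=$1
    hti_interval=$2
    if [ "$hti_timeout" -eq 0 ]; then
        # hung_task_timeout_secs == 0: no checking is done at all.
        echo disabled
    elif [ "$hti_interval" -eq 0 ]; then
        # Default: fall back to the timeout as the checking interval.
        echo "$hti_timeout"
    else
        echo "$hti_interval"
    fi
}
```

For instance, with a timeout of 120 seconds,
``hung_task_effective_interval 120 0`` prints ``120`` and
``hung_task_effective_interval 120 30`` prints ``30``.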


hung_task_warnings
==================

The maximum number of warnings to report. During a check interval
if a hung task is detected, this value is decreased by 1.
When this value reaches 0, no more warnings will be reported.
This file shows up if ``CONFIG_DETECT_HUNG_TASK`` is enabled.

-1: report an infinite number of warnings.


hyperv_record_panic_msg
=======================

Controls whether the panic kmsg data should be reported to Hyper-V.

= =========================================================
0 Do not report panic kmsg data.
1 Report the panic kmsg data. This is the default behavior.
= =========================================================


ignore-unaligned-usertrap
=========================

On architectures where unaligned accesses cause traps, and where this
feature is supported (``CONFIG_SYSCTL_ARCH_UNALIGN_NO_WARN``;
currently, ``arc`` and ``ia64``), controls whether all unaligned traps
are logged.

= =============================================================
0 Log all unaligned accesses.
1 Only warn the first time a process traps. This is the default
  setting.
= =============================================================

See also `unaligned-trap`_ and `unaligned-dump-stack`_. On ``ia64``,
this allows system administrators to override the
``IA64_THREAD_UAC_NOPRINT`` ``prctl`` and avoid logs being flooded.


kexec_load_disabled
===================

A toggle indicating if the ``kexec_load`` syscall has been disabled.
This value defaults to 0 (false: ``kexec_load`` enabled), but can be
set to 1 (true: ``kexec_load`` disabled).
Once true, kexec can no longer be used, and the toggle cannot be set
back to false.
This allows a kexec image to be loaded before disabling the syscall,
allowing a system to set up (and later use) an image without it being
altered.
Generally used together with the `modules_disabled`_ sysctl.


kptr_restrict
=============

This toggle indicates whether restrictions are placed on
exposing kernel addresses via ``/proc`` and other interfaces.

When ``kptr_restrict`` is set to 0 (the default) the address is hashed
before printing.
(This is equivalent to %p.)

When ``kptr_restrict`` is set to 1, kernel pointers printed using the
%pK format specifier will be replaced with 0s unless the user has
``CAP_SYSLOG`` and effective user and group ids are equal to the real
ids.
This is because %pK checks are done at read() time rather than open()
time, so if permissions are elevated between the open() and the read()
(e.g. via a setuid binary) then %pK will not leak kernel pointers to
unprivileged users.
Note, this is a temporary solution only.
The correct long-term solution is to do the permission checks at
open() time.
Consider removing world read permissions from files that use %pK, and
using `dmesg_restrict`_ to protect against uses of %pK in ``dmesg(8)``
if leaking kernel pointer values to unprivileged users is a concern.

When ``kptr_restrict`` is set to 2, kernel pointers printed using
%pK will be replaced with 0s regardless of privileges.


modprobe
========

The full path to the usermode helper for autoloading kernel modules,
by default "/sbin/modprobe". This binary is executed when the kernel
requests a module. For example, if userspace passes an unknown
filesystem type to mount(), then the kernel will automatically request
the corresponding filesystem module by executing this usermode helper.
This usermode helper should insert the needed module into the kernel.

This sysctl only affects module autoloading. It has no effect on the
ability to explicitly insert modules.

This sysctl can be used to debug module loading requests::

    echo '#! /bin/sh' > /tmp/modprobe
    echo 'echo "$@" >> /tmp/modprobe.log' >> /tmp/modprobe
    echo 'exec /sbin/modprobe "$@"' >> /tmp/modprobe
    chmod a+x /tmp/modprobe
    echo /tmp/modprobe > /proc/sys/kernel/modprobe

Alternatively, if this sysctl is set to the empty string, then module
autoloading is completely disabled. The kernel will not try to
execute a usermode helper at all, nor will it call the
``kernel_module_request`` LSM hook.

If ``CONFIG_STATIC_USERMODEHELPER=y`` is set in the kernel configuration,
then the configured static usermode helper overrides this sysctl,
except that the empty string is still accepted to completely disable
module autoloading as described above.


modules_disabled
================

A toggle value indicating if modules are allowed to be loaded
in an otherwise modular kernel. This toggle defaults to off
(0), but can be set true (1). Once true, modules can be
neither loaded nor unloaded, and the toggle cannot be set back
to false. Generally used with the `kexec_load_disabled`_ toggle.


.. _msgmni:

msgmax, msgmnb, and msgmni
==========================

``msgmax`` is the maximum size of an IPC message, in bytes. 8192 by
default (``MSGMAX``).

``msgmnb`` is the maximum size of an IPC queue, in bytes. 16384 by
default (``MSGMNB``).

``msgmni`` is the maximum number of IPC queues. 32000 by default
(``MSGMNI``).


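As a rough illustration of how these three limits relate, the sketch
below works out how many maximum-size messages fit in one queue and the
upper bound on bytes queued if every queue were full. The function names
are hypothetical, the arguments are simply the default values quoted
above, and any per-message bookkeeping overhead is ignored.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only; plain integer arithmetic.

# Number of maximum-size messages that fit in one queue.
# $1 = msgmnb (bytes per queue), $2 = msgmax (bytes per message)
msgs_per_queue() {
    echo $(( $1 / $2 ))
}

# Upper bound on bytes queued system-wide if every queue is full.
# $1 = msgmni (number of queues), $2 = msgmnb (bytes per queue)
max_queued_bytes() {
    echo $(( $1 * $2 ))
}

msgs_per_queue 16384 8192       # prints 2
max_queued_bytes 32000 16384    # prints 524288000
```

On a running system the live values could be substituted, for example
with ``$(cat /proc/sys/kernel/msgmnb)``.
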
msg_next_id, sem_next_id, and shm_next_id (System V IPC)
========================================================

These three toggles allow the desired id to be specified for the next
allocated IPC object: message, semaphore or shared memory respectively.

By default they are equal to -1, which means generic allocation logic.
Possible values to set are in the range {0:``INT_MAX``}.

Notes:
  1) The kernel doesn't guarantee that the new object will have the
     desired id. So it is up to userspace how to handle an object with
     a "wrong" id.
  2) A toggle with a non-default value will be set back to -1 by the
     kernel after successful IPC object allocation. If an IPC object
     allocation syscall fails, it is undefined whether the value remains
     unmodified or is reset to -1.


ngroups_max
===========

Maximum number of supplementary groups, *i.e.* the maximum size which
``setgroups`` will accept. Exports ``NGROUPS_MAX`` from the kernel.



nmi_watchdog
============

This parameter can be used to control the NMI watchdog
(i.e. the hard lockup detector) on x86 systems.

= =================================
0 Disable the hard lockup detector.
1 Enable the hard lockup detector.
= =================================

The hard lockup detector monitors each CPU for its ability to respond to
timer interrupts. The mechanism utilizes CPU performance counter registers
that are programmed to generate Non-Maskable Interrupts (NMIs) periodically
while a CPU is busy. Hence, the alternative name 'NMI watchdog'.

The NMI watchdog is disabled by default if the kernel is running as a guest
in a KVM virtual machine. This default can be overridden by adding::

   nmi_watchdog=1

to the guest kernel command line (see :doc:`/admin-guide/kernel-parameters`).


numa_balancing
==============

Enables/disables automatic page fault based NUMA memory
balancing. Memory is moved automatically to nodes
that access it often.

Enables/disables automatic NUMA memory balancing. On NUMA machines, there
is a performance penalty if remote memory is accessed by a CPU. When this
feature is enabled the kernel samples which task thread is accessing memory
by periodically unmapping pages and later trapping a page fault. At the
time of the page fault, it is determined if the data being accessed should
be migrated to a local memory node.

The unmapping of pages and trapping faults incur additional overhead that
ideally is offset by improved memory locality but there is no universal
guarantee. If the target workload is already bound to NUMA nodes then this
feature should be disabled. Otherwise, if the system overhead from the
feature is too high then the rate at which the kernel samples for NUMA
hinting faults may be controlled by the `numa_balancing_scan_period_min_ms,
numa_balancing_scan_delay_ms, numa_balancing_scan_period_max_ms,
numa_balancing_scan_size_mb`_, and numa_balancing_settle_count sysctls.


numa_balancing_scan_period_min_ms, numa_balancing_scan_delay_ms, numa_balancing_scan_period_max_ms, numa_balancing_scan_size_mb
===============================================================================================================================


Automatic NUMA balancing scans a task's address space and unmaps pages to
detect if pages are properly placed or if the data should be migrated to a
memory node local to where the task is running. Every "scan delay" the task
scans the next "scan size" number of pages in its address space. When the
end of the address space is reached the scanner restarts from the beginning.

In combination, the "scan delay" and "scan size" determine the scan rate.
When "scan delay" decreases, the scan rate increases. The scan delay and
hence the scan rate of every task is adaptive and depends on historical
behaviour. If pages are properly placed then the scan delay increases,
otherwise the scan delay decreases. The "scan size" is not adaptive but
the higher the "scan size", the higher the scan rate.

Higher scan rates incur higher system overhead as page faults must be
trapped and potentially data must be migrated. However, the higher the scan
rate, the more quickly a task's memory is migrated to a local node if the
workload pattern changes, which minimises the performance impact of remote
memory accesses. These sysctls control the thresholds for scan delays and
the number of pages scanned.

``numa_balancing_scan_period_min_ms`` is the minimum time in milliseconds to
scan a task's virtual memory. It effectively controls the maximum scanning
rate for each task.

``numa_balancing_scan_delay_ms`` is the starting "scan delay" used for a task
when it initially forks.

``numa_balancing_scan_period_max_ms`` is the maximum time in milliseconds to
scan a task's virtual memory. It effectively controls the minimum scanning
rate for each task.

``numa_balancing_scan_size_mb`` is how many megabytes worth of pages are
scanned for a given scan.


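The relationship between "scan size", "scan delay" and scan rate can be
sketched with a little arithmetic. The helper below is hypothetical and
the numbers passed to it are illustrative values only, not a statement of
what any particular kernel uses; it treats the rate as scan size divided
by scan delay.

```shell
#!/bin/sh
# Hypothetical helper: approximate scan rate in MB per second.
# $1 = scan size in MB, $2 = scan delay in milliseconds
numa_scan_rate_mb_per_s() {
    echo $(( $1 * 1000 / $2 ))
}

# Same scan size, short delay versus long delay (illustrative values).
numa_scan_rate_mb_per_s 256 1000     # prints 256
numa_scan_rate_mb_per_s 256 60000    # prints 4 (integer division)
```

The two calls show why the minimum period bounds the maximum scanning
rate and the maximum period bounds the minimum scanning rate.
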
oops_all_cpu_backtrace
======================

If this option is set, the kernel will send an NMI to all CPUs to dump
their backtraces when an oops event occurs. It should be used as a last
resort in case a panic cannot be triggered (to protect running VMs, for
example) or kdump can't be collected. This file shows up if CONFIG_SMP
is enabled.

0: Won't show all CPUs' backtraces when an oops is detected.
This is the default behavior.

1: Will non-maskably interrupt all CPUs and dump their backtraces when
an oops event is detected.


osrelease, ostype & version
===========================

::

  # cat osrelease
  2.1.88
  # cat ostype
  Linux
  # cat version
  #5 Wed Feb 25 21:49:24 MET 1998

The files ``osrelease`` and ``ostype`` should be clear enough.
``version`` needs a little more clarification however. The '#5' means that
this is the fifth kernel built from this source base and the
date behind it indicates the time the kernel was built.
The only way to tune these values is to rebuild the kernel :-)


overflowgid & overflowuid
=========================

If your architecture did not always support 32-bit UIDs (i.e. arm,
i386, m68k, sh, and sparc32), a fixed UID and GID will be returned to
applications that use the old 16-bit UID/GID system calls, if the
actual UID or GID would exceed 65535.

These sysctls allow you to change the value of the fixed UID and GID.
The default is 65534.


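The substitution rule described above can be written down as a small
sketch. The function name is hypothetical and this only models the rule
as stated in the text; it is not how the kernel implements it.

```shell
#!/bin/sh
# Hypothetical model of what a legacy 16-bit UID/GID call would report.
# $1 = actual UID or GID, $2 = configured overflow value
legacy_id16() {
    if [ "$1" -gt 65535 ]; then
        echo "$2"      # does not fit in 16 bits: report the fixed value
    else
        echo "$1"      # fits: report the real id
    fi
}

legacy_id16 1000 65534      # prints 1000
legacy_id16 100000 65534    # prints 65534
```
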
panic
=====

The value in this file determines the behaviour of the kernel on a
panic:

* if zero, the kernel will loop forever;
* if negative, the kernel will reboot immediately;
* if positive, the kernel will reboot after the corresponding number
  of seconds.

When you use the software watchdog, the recommended setting is 60.


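The three cases listed above can be summarised in a short sketch that
maps a candidate value to the behaviour it selects. The function name is
hypothetical; it only restates the list and does not touch the sysctl.

```shell
#!/bin/sh
# Hypothetical helper: describe what a given "panic" value means.
# $1 = value that would be written to /proc/sys/kernel/panic
describe_panic() {
    if [ "$1" -eq 0 ]; then
        echo "loop forever"
    elif [ "$1" -lt 0 ]; then
        echo "reboot immediately"
    else
        echo "reboot after $1 seconds"
    fi
}

describe_panic 0      # prints: loop forever
describe_panic -1     # prints: reboot immediately
describe_panic 60     # prints: reboot after 60 seconds
```
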
panic_on_io_nmi
===============

Controls the kernel's behavior when a CPU receives an NMI caused by
an IO error.

= ==================================================================
0 Try to continue operation (default).
1 Panic immediately. The IO error triggered an NMI. This indicates a
  serious system condition which could result in IO data corruption.
  Rather than continuing, panicking might be a better choice. Some
  servers issue this sort of NMI when the dump button is pushed,
  and you can use this option to take a crash dump.
= ==================================================================


panic_on_oops
=============

Controls the kernel's behaviour when an oops or BUG is encountered.

= ===================================================================
0 Try to continue operation.
1 Panic immediately. If the `panic` sysctl is also non-zero then the
  machine will be rebooted.
= ===================================================================


panic_on_stackoverflow
======================

Controls the kernel's behavior when it detects an overflow of the
kernel, IRQ or exception stacks (but not of a user stack).
This file shows up if ``CONFIG_DEBUG_STACKOVERFLOW`` is enabled.

= ==========================
0 Try to continue operation.
1 Panic immediately.
= ==========================


panic_on_unrecovered_nmi
========================

The default Linux behaviour on an NMI caused by either a memory error or
an unknown source is to continue operation. For many environments, such as
scientific computing, it is preferable that the box is taken out and the
error dealt with rather than letting an uncorrected parity/ECC error
propagate.

A small number of systems do generate NMIs for bizarre random reasons
such as power management so the default is off. This sysctl works like
the other panic controls in this directory.


panic_on_warn
=============

Calls panic() in the WARN() path when set to 1. This is useful to avoid
a kernel rebuild when attempting to kdump at the location of a WARN().

= ================================================
0 Only WARN(), default behaviour.
1 Call panic() after printing out WARN() location.
= ================================================


panic_print
===========

Bitmask for printing system info when a panic happens. The user can choose
a combination of the following bits:

=====  ============================================
bit 0  print all tasks info
bit 1  print system memory info
bit 2  print timer info
bit 3  print locks info if ``CONFIG_LOCKDEP`` is on
bit 4  print ftrace buffer
=====  ============================================

So for example to print tasks and memory info on panic, the user can run::

  echo 3 > /proc/sys/kernel/panic_print


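Because the value is a bitmask, it can help to build it from bit numbers
and to decode it again. The two functions below are a hypothetical
sketch based only on the table above; their names are not part of any
kernel tooling.

```shell
#!/bin/sh
# Hypothetical helper: build a panic_print value from bit numbers.
# Arguments: any of the bit numbers 0..4 from the table.
panic_print_mask() {
    mask=0
    for bit in "$@"; do
        mask=$(( mask | (1 << bit) ))
    done
    echo "$mask"
}

# Hypothetical helper: list what a given panic_print value would print.
# $1 = bitmask value
panic_print_decode() {
    mask=$1
    if [ $(( mask & 1 )) -ne 0 ]; then echo "all tasks info"; fi
    if [ $(( mask & 2 )) -ne 0 ]; then echo "system memory info"; fi
    if [ $(( mask & 4 )) -ne 0 ]; then echo "timer info"; fi
    if [ $(( mask & 8 )) -ne 0 ]; then echo "locks info"; fi
    if [ $(( mask & 16 )) -ne 0 ]; then echo "ftrace buffer"; fi
    return 0
}

panic_print_mask 0 1     # prints 3
panic_print_decode 3     # prints the first two entries of the table
```
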
panic_on_rcu_stall
==================

When set to 1, calls panic() after RCU stall detection messages. This
is useful for determining the root cause of RCU stalls using a vmcore.

= ============================================================
0 Do not panic() when RCU stall takes place, default behavior.
1 panic() after printing RCU stall messages.
= ============================================================


perf_cpu_time_max_percent
=========================

Hints to the kernel how much CPU time it should be allowed to
use to handle perf sampling events. If the perf subsystem
is informed that its samples are exceeding this limit, it
will drop its sampling frequency to attempt to reduce its CPU
usage.

Some perf sampling happens in NMIs. If these samples
unexpectedly take too long to execute, the NMIs can become
stacked up next to each other so much that nothing else is
allowed to execute.

===== ========================================================
0     Disable the mechanism. Do not monitor or correct perf's
      sampling rate no matter how much CPU time it takes.

1-100 Attempt to throttle perf's sample rate to this
      percentage of CPU. Note: the kernel calculates an
      "expected" length of each sample event. 100 here means
      100% of that expected length. Even if this is set to
      100, you may still see sample throttling if this
      length is exceeded. Set to 0 if you truly do not care
      how much CPU is consumed.
===== ========================================================


perf_event_paranoid
===================

Controls use of the performance events system by unprivileged
users (without CAP_PERFMON). The default value is 2.

For backward compatibility reasons access to system performance
monitoring and observability remains open for CAP_SYS_ADMIN
privileged processes, but using CAP_SYS_ADMIN for secure system
performance monitoring and observability operations is discouraged
in favour of CAP_PERFMON.

===  ==================================================================
 -1  Allow use of (almost) all events by all users.

     Ignore mlock limit after perf_event_mlock_kb without
     ``CAP_IPC_LOCK``.

>=0  Disallow ftrace function tracepoint by users without
     ``CAP_PERFMON``.

     Disallow raw tracepoint access by users without ``CAP_PERFMON``.

>=1  Disallow CPU event access by users without ``CAP_PERFMON``.

>=2  Disallow kernel profiling by users without ``CAP_PERFMON``.
===  ==================================================================


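The levels are cumulative: each higher value keeps the restrictions of
the lower ones and adds another. A minimal sketch of that reading of the
table follows; the function name and the wording of its output are
hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: list what users without CAP_PERFMON are denied
# at a given perf_event_paranoid level, following the table above.
# $1 = level (-1, 0, 1, 2, ...)
perf_paranoid_restrictions() {
    level=$1
    if [ "$level" -ge 0 ]; then
        echo "ftrace function tracepoint and raw tracepoint access"
    fi
    if [ "$level" -ge 1 ]; then
        echo "CPU event access"
    fi
    if [ "$level" -ge 2 ]; then
        echo "kernel profiling"
    fi
    return 0
}

perf_paranoid_restrictions -1    # prints nothing
perf_paranoid_restrictions 2     # prints all three restrictions
```
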
perf_event_max_stack
====================

Controls the maximum number of stack frames to copy for
(``attr.sample_type & PERF_SAMPLE_CALLCHAIN``) configured events, for
instance, when using '``perf record -g``' or '``perf trace --call-graph fp``'.

The value can only be changed when no events are in use that have
callchains enabled; otherwise writing to this file will return ``-EBUSY``.

The default value is 127.


perf_event_mlock_kb
===================

Controls the size of the per-cpu ring buffer that is not counted against
the mlock limit.

The default value is 512 + 1 page.


perf_event_max_contexts_per_stack
=================================

Controls the maximum number of stack frame context entries for
(``attr.sample_type & PERF_SAMPLE_CALLCHAIN``) configured events, for
instance, when using '``perf record -g``' or '``perf trace --call-graph fp``'.

The value can only be changed when no events are in use that have
callchains enabled; otherwise writing to this file will return ``-EBUSY``.

The default value is 8.


pid_max
=======

PID allocation wrap value. When the kernel's next PID value
reaches this value, it wraps back to a minimum PID value.
PIDs of value ``pid_max`` or larger are not allocated.


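The wrap-around can be pictured with a simplified sketch. The function
name is hypothetical, the numbers are illustrative only, and the model
ignores the fact that a real allocator must also skip PIDs that are
still in use.

```shell
#!/bin/sh
# Hypothetical, simplified model of PID wrap-around.
# $1 = last PID handed out
# $2 = pid_max
# $3 = minimum PID to wrap back to (an assumed value for illustration)
next_pid() {
    candidate=$(( $1 + 1 ))
    if [ "$candidate" -ge "$2" ]; then
        echo "$3"            # pid_max itself is never allocated
    else
        echo "$candidate"
    fi
}

next_pid 100 32768 300       # prints 101
next_pid 32767 32768 300     # prints 300
```
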
ns_last_pid
===========

The last pid allocated in the current pid namespace (the one in which
the task using this sysctl lives). When selecting a pid for the next task
on fork, the kernel tries to allocate a number starting from this one.


powersave-nap (PPC only)
========================

If set, Linux-PPC will use the 'nap' mode of powersaving,
otherwise the 'doze' mode will be used.


==============================================================

printk
======

The four values in printk denote: ``console_loglevel``,
``default_message_loglevel``, ``minimum_console_loglevel`` and
``default_console_loglevel`` respectively.

These values influence printk() behavior when printing or
logging error messages. See '``man 2 syslog``' for more info on
the different loglevels.

======================== =====================================
console_loglevel         messages with a higher priority than
                         this will be printed to the console
default_message_loglevel messages without an explicit priority
                         will be printed with this priority
minimum_console_loglevel minimum (highest) value to which
                         console_loglevel can be set
default_console_loglevel default value for console_loglevel
======================== =====================================


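To make the ordering of the four fields concrete, the sketch below
labels them and applies the console rule from the table, where a
numerically lower level means a higher priority. Both function names are
hypothetical and the sample string ``4 4 1 7`` is only an example input,
not a claim about any system's settings.

```shell
#!/bin/sh
# Hypothetical helper: label the four whitespace-separated printk values.
# $1 = the values as one string, e.g. the contents of /proc/sys/kernel/printk
printk_fields() {
    set -- $1        # split the string on whitespace into $1..$4
    echo "console_loglevel=$1"
    echo "default_message_loglevel=$2"
    echo "minimum_console_loglevel=$3"
    echo "default_console_loglevel=$4"
}

# Hypothetical helper: succeed if a message of level $1 has a higher
# priority (lower number) than console_loglevel $2.
reaches_console() {
    [ "$1" -lt "$2" ]
}

printk_fields "4 4 1 7"
if reaches_console 3 4; then
    echo "level 3 is printed to the console"
fi
```

On a live system the first function could be fed with
``"$(cat /proc/sys/kernel/printk)"``.
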
printk_delay
============

Delay each printk message by ``printk_delay`` milliseconds.

Values from 0 to 10000 are allowed.


printk_ratelimit
================

Some warning messages are rate limited. ``printk_ratelimit`` specifies
the minimum length of time between these messages (in seconds).
The default value is 5 seconds.

A value of 0 will disable rate limiting.


printk_ratelimit_burst
======================

While long term we enforce one message per `printk_ratelimit`_
seconds, we do allow a burst of messages to pass through.
``printk_ratelimit_burst`` specifies the number of messages we can
send before ratelimiting kicks in.

The default value is 10 messages.


printk_devkmsg
==============

Control the logging to ``/dev/kmsg`` from userspace:

========= =============================================
ratelimit default, ratelimited
on        unlimited logging to /dev/kmsg from userspace
off       logging to /dev/kmsg disabled
========= =============================================

The kernel command line parameter ``printk.devkmsg=`` overrides this and is
a one-time setting until next reboot: once set, it cannot be changed by
this sysctl interface anymore.

==============================================================


pty
===

See Documentation/filesystems/devpts.rst.


random
======

This is a directory, with the following entries:

* ``boot_id``: a UUID generated the first time this is retrieved, and
  unvarying after that;

* ``entropy_avail``: the pool's entropy count, in bits;

* ``poolsize``: the entropy pool size, in bits;

* ``urandom_min_reseed_secs``: obsolete (used to determine the minimum
  number of seconds between urandom pool reseeding);

* ``uuid``: a UUID generated every time this is retrieved (this can
  thus be used to generate UUIDs at will);

* ``write_wakeup_threshold``: when the entropy count drops below this
  (as a number of bits), processes waiting to write to ``/dev/random``
  are woken up.

If ``drivers/char/random.c`` is built with ``ADD_INTERRUPT_BENCH``
defined, these additional entries are present:

* ``add_interrupt_avg_cycles``: the average number of cycles between
  interrupts used to feed the pool;

* ``add_interrupt_avg_deviation``: the standard deviation seen on the
  number of cycles between interrupts used to feed the pool.


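A script that consumes ``boot_id`` or ``uuid`` may want to check that
what it read has the usual textual UUID shape. The helper below is a
hypothetical sketch; it assumes the lowercase hexadecimal 8-4-4-4-12
layout and the sample string passed to it is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeed if $1 looks like a textual UUID
# (lowercase hex, grouped 8-4-4-4-12).
looks_like_uuid() {
    printf '%s\n' "$1" |
        grep -Eq '^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$'
}

if looks_like_uuid "6a5d4c3b-2f1e-4d0c-9b8a-7f6e5d4c3b2a"; then
    echo "well-formed"
fi
```

On a live system the check could be applied to a freshly generated value
with ``looks_like_uuid "$(cat /proc/sys/kernel/random/uuid)"``.
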
| 1032 | randomize_va_space |
| 1033 | ================== |
David Brazdil | 0f672f6 | 2019-12-10 10:32:29 +0000 | [diff] [blame] | 1034 | |
| 1035 | This option can be used to select the type of process address |
| 1036 | space randomization that is used in the system, for architectures |
| 1037 | that support this feature. |
| 1038 | |
| 1039 | == =========================================================================== |
| 1040 | 0 Turn the process address space randomization off. This is the |
| 1041 | default for architectures that do not support this feature anyway, |
| 1042 | and kernels that are booted with the "norandmaps" parameter. |
| 1043 | |
| 1044 | 1 Make the addresses of mmap base, stack and VDSO page randomized. |
| 1045 | This, among other things, implies that shared libraries will be |
| 1046 | loaded to random addresses. Also for PIE-linked binaries, the |
| 1047 | location of code start is randomized. This is the default if the |
Olivier Deprez | 157378f | 2022-04-04 15:47:50 +0200 | [diff] [blame^] | 1048 | ``CONFIG_COMPAT_BRK`` option is enabled. |
David Brazdil | 0f672f6 | 2019-12-10 10:32:29 +0000 | [diff] [blame] | 1049 | |
| 1050 | 2 Additionally enable heap randomization. This is the default if |
Olivier Deprez | 157378f | 2022-04-04 15:47:50 +0200 | [diff] [blame^] | 1051 | ``CONFIG_COMPAT_BRK`` is disabled. |
David Brazdil | 0f672f6 | 2019-12-10 10:32:29 +0000 | [diff] [blame] | 1052 | |
| 1053 | There are a few legacy applications out there (such as some ancient |
| 1054 | versions of libc.so.5 from 1996) that assume that the brk area starts |
| 1055 | just after the end of the code+bss. These applications break when |
| 1056 | the start of the brk area is randomized. There are however no known |
| 1057 | non-legacy applications that would be broken this way, so for most |
| 1058 | systems it is safe to choose full randomization. |
| 1059 | |
| 1060 | Systems with ancient and/or broken binaries should be configured |
Olivier Deprez | 157378f | 2022-04-04 15:47:50 +0200 | [diff] [blame^] | 1061 | with ``CONFIG_COMPAT_BRK`` enabled, which excludes the heap from process |
David Brazdil | 0f672f6 | 2019-12-10 10:32:29 +0000 | [diff] [blame] | 1062 | address space randomization. |
| 1063 | == =========================================================================== |
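The three values in the table can be turned into a short human-readable summary. This is only a sketch: the function name and the wording of the summaries are ours, not kernel output.

```shell
# Map a randomize_va_space value to a short description.
# The descriptions paraphrase the table above; they are not kernel output.
describe_randomize_va_space() {
    case "$1" in
        0) echo "randomization off" ;;
        1) echo "mmap base, stack and VDSO randomized" ;;
        2) echo "mmap base, stack, VDSO and heap randomized" ;;
        *) echo "unknown value: $1" >&2; return 1 ;;
    esac
}
```

On a running system it could be called as ``describe_randomize_va_space "$(cat /proc/sys/kernel/randomize_va_space)"``.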
| 1064 | |
| 1065 | |
Olivier Deprez | 157378f | 2022-04-04 15:47:50 +0200 | [diff] [blame^] | 1066 | real-root-dev |
| 1067 | ============= |
| 1068 | |
| 1069 | See :doc:`/admin-guide/initrd`. |
| 1070 | |
| 1071 | |
| 1072 | reboot-cmd (SPARC only) |
| 1073 | ======================= |
David Brazdil | 0f672f6 | 2019-12-10 10:32:29 +0000 | [diff] [blame] | 1074 | |
| 1075 | ??? This seems to be a way to give an argument to the Sparc |
| 1076 | ROM/Flash boot loader. Maybe to tell it what to do after |
| 1077 | rebooting. ??? |
| 1078 | |
| 1079 | |
Olivier Deprez | 157378f | 2022-04-04 15:47:50 +0200 | [diff] [blame^] | 1080 | sched_energy_aware |
| 1081 | ================== |
David Brazdil | 0f672f6 | 2019-12-10 10:32:29 +0000 | [diff] [blame] | 1082 | |
| 1083 | Enables/disables Energy Aware Scheduling (EAS). EAS starts |
| 1084 | automatically on platforms where it can run (that is, |
| 1085 | platforms with asymmetric CPU topologies and having an Energy |
| 1086 | Model available). If your platform happens to meet the |
| 1087 | requirements for EAS but you do not want to use it, change |
| 1088 | this value to 0. |
| 1089 | |
| 1090 | |
Olivier Deprez | 157378f | 2022-04-04 15:47:50 +0200 | [diff] [blame^] | 1091 | sched_schedstats |
| 1092 | ================ |
David Brazdil | 0f672f6 | 2019-12-10 10:32:29 +0000 | [diff] [blame] | 1093 | |
| 1094 | Enables/disables scheduler statistics. Enabling this feature |
| 1095 | incurs a small amount of overhead in the scheduler but is |
| 1096 | useful for debugging and performance tuning. |
| 1097 | |
Olivier Deprez | 157378f | 2022-04-04 15:47:50 +0200 | [diff] [blame^] | 1098 | sched_util_clamp_min |
| 1099 | ==================== |
David Brazdil | 0f672f6 | 2019-12-10 10:32:29 +0000 | [diff] [blame] | 1100 | |
Olivier Deprez | 157378f | 2022-04-04 15:47:50 +0200 | [diff] [blame^] | 1101 | Max allowed *minimum* utilization. |
| 1102 | |
| 1103 | Default value is 1024, which is the maximum possible value. |
| 1104 | |
| 1105 | It means that any requested uclamp.min value cannot be greater than |
| 1106 | sched_util_clamp_min, i.e., it is restricted to the range |
| 1107 | [0:sched_util_clamp_min]. |
| 1108 | |
| 1109 | sched_util_clamp_max |
| 1110 | ==================== |
| 1111 | |
| 1112 | Max allowed *maximum* utilization. |
| 1113 | |
| 1114 | Default value is 1024, which is the maximum possible value. |
| 1115 | |
| 1116 | It means that any requested uclamp.max value cannot be greater than |
| 1117 | sched_util_clamp_max, i.e., it is restricted to the range |
| 1118 | [0:sched_util_clamp_max]. |
| 1119 | |
| 1120 | sched_util_clamp_min_rt_default |
| 1121 | =============================== |
| 1122 | |
| 1123 | By default Linux is tuned for performance, which means that RT tasks always run |
| 1124 | at the highest frequency and on the most capable (highest capacity) CPU (in |
| 1125 | heterogeneous systems). |
| 1126 | |
| 1127 | Uclamp achieves this by setting the requested uclamp.min of all RT tasks to |
| 1128 | 1024 by default, which effectively boosts the tasks to run at the highest |
| 1129 | frequency and biases them to run on the biggest CPU. |
| 1130 | |
| 1131 | This knob allows admins to change the default behavior when uclamp is being |
| 1132 | used. In battery powered devices particularly, running at the maximum |
| 1133 | capacity and frequency will increase energy consumption and shorten the battery |
| 1134 | life. |
| 1135 | |
| 1136 | This knob is only effective for RT tasks whose requested uclamp.min value has |
| 1137 | not been modified by the user via the sched_setattr() syscall. |
| 1138 | |
| 1139 | This knob will not escape the range constraint imposed by sched_util_clamp_min |
| 1140 | defined above. |
| 1141 | |
| 1142 | For example, if |
| 1143 | |
| 1144 | sched_util_clamp_min_rt_default = 800 |
| 1145 | sched_util_clamp_min = 600 |
| 1146 | |
| 1147 | Then the boost will be clamped to 600 because 800 is outside of the permissible |
| 1148 | range of [0:600]. This could happen, for instance, if a powersave mode |
| 1149 | temporarily restricts all boosts by modifying sched_util_clamp_min. As soon as |
| 1150 | this restriction is lifted, the requested sched_util_clamp_min_rt_default |
| 1151 | will take effect. |
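The range constraint in this example amounts to taking the smaller of the two values. The sketch below models only that arithmetic; the function name ``effective_rt_boost`` is hypothetical and this is not how the kernel implements the clamp.

```shell
# Return the boost an RT task would effectively get: the requested default,
# limited to the range [0:sched_util_clamp_min].
# $1 = sched_util_clamp_min_rt_default, $2 = sched_util_clamp_min
effective_rt_boost() {
    rt_default=$1
    clamp_min=$2
    if [ "$rt_default" -gt "$clamp_min" ]; then
        echo "$clamp_min"
    else
        echo "$rt_default"
    fi
}

effective_rt_boost 800 600   # prints 600, matching the example above
```

Once the restriction is lifted (for instance ``sched_util_clamp_min`` back at 1024), the same call with ``800 1024`` yields the requested 800.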
| 1152 | |
| 1153 | seccomp |
| 1154 | ======= |
| 1155 | |
| 1156 | See :doc:`/userspace-api/seccomp_filter`. |
| 1157 | |
| 1158 | |
| 1159 | sg-big-buff |
| 1160 | =========== |
David Brazdil | 0f672f6 | 2019-12-10 10:32:29 +0000 | [diff] [blame] | 1161 | |
| 1162 | This file shows the size of the generic SCSI (sg) buffer. |
| 1163 | You can't tune it just yet, but you could change it at |
Olivier Deprez | 157378f | 2022-04-04 15:47:50 +0200 | [diff] [blame^] | 1164 | compile time by editing ``include/scsi/sg.h`` and changing |
| 1165 | the value of ``SG_BIG_BUFF``. |
David Brazdil | 0f672f6 | 2019-12-10 10:32:29 +0000 | [diff] [blame] | 1166 | |
| 1167 | There shouldn't be any reason to change this value. If |
| 1168 | you can come up with one, you probably know what you |
| 1169 | are doing anyway :) |
| 1170 | |
| 1171 | |
Olivier Deprez | 157378f | 2022-04-04 15:47:50 +0200 | [diff] [blame^] | 1172 | shmall |
| 1173 | ====== |
David Brazdil | 0f672f6 | 2019-12-10 10:32:29 +0000 | [diff] [blame] | 1174 | |
| 1175 | This parameter sets the total number of shared memory pages that |
Olivier Deprez | 157378f | 2022-04-04 15:47:50 +0200 | [diff] [blame^] | 1176 | can be used system-wide. Hence, ``shmall`` should always be at least |
| 1177 | ``ceil(shmmax/PAGE_SIZE)``. |
David Brazdil | 0f672f6 | 2019-12-10 10:32:29 +0000 | [diff] [blame] | 1178 | |
Olivier Deprez | 157378f | 2022-04-04 15:47:50 +0200 | [diff] [blame^] | 1179 | If you are not sure what the default ``PAGE_SIZE`` is on your Linux |
| 1180 | system, you can run the following command:: |
David Brazdil | 0f672f6 | 2019-12-10 10:32:29 +0000 | [diff] [blame] | 1181 | |
| 1182 | # getconf PAGE_SIZE |
| 1183 | |
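The lower bound ``ceil(shmmax/PAGE_SIZE)`` can be computed with integer arithmetic. This is a sketch under two assumptions: the function name ``min_shmall`` is hypothetical, and both values fit in the shell's integer type (typically signed 64-bit), so a very large ``shmmax`` may overflow.

```shell
# Smallest shmall (in pages) that can hold one segment of shmmax bytes.
# $1 = shmmax in bytes, $2 = page size in bytes
min_shmall() {
    shmmax=$1
    page_size=$2
    # Integer ceiling division: round up unless shmmax is a page multiple.
    echo $(( (shmmax + page_size - 1) / page_size ))
}
```

For example, ``min_shmall 4097 4096`` prints 2, because one byte past a page boundary needs a second page. On a running system the inputs could come from ``cat /proc/sys/kernel/shmmax`` and ``getconf PAGE_SIZE``.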
| 1184 | |
Olivier Deprez | 157378f | 2022-04-04 15:47:50 +0200 | [diff] [blame^] | 1185 | shmmax |
| 1186 | ====== |
David Brazdil | 0f672f6 | 2019-12-10 10:32:29 +0000 | [diff] [blame] | 1187 | |
| 1188 | This value can be used to query and set the run time limit |
| 1189 | on the maximum shared memory segment size that can be created. |
| 1190 | Shared memory segments up to 1 GB are now supported in the |
Olivier Deprez | 157378f | 2022-04-04 15:47:50 +0200 | [diff] [blame^] | 1191 | kernel. This value defaults to ``SHMMAX``. |
David Brazdil | 0f672f6 | 2019-12-10 10:32:29 +0000 | [diff] [blame] | 1192 | |
| 1193 | |
Olivier Deprez | 157378f | 2022-04-04 15:47:50 +0200 | [diff] [blame^] | 1194 | shmmni |
| 1195 | ====== |
| 1196 | |
| 1197 | This value determines the maximum number of shared memory segments. |
| 1198 | 4096 by default (``SHMMNI``). |
| 1199 | |
| 1200 | |
| 1201 | shm_rmid_forced |
| 1202 | =============== |
David Brazdil | 0f672f6 | 2019-12-10 10:32:29 +0000 | [diff] [blame] | 1203 | |
| 1204 | Linux lets you set resource limits, including how much memory one |
Olivier Deprez | 157378f | 2022-04-04 15:47:50 +0200 | [diff] [blame^] | 1205 | process can consume, via ``setrlimit(2)``. Unfortunately, shared memory |
David Brazdil | 0f672f6 | 2019-12-10 10:32:29 +0000 | [diff] [blame] | 1206 | segments are allowed to exist without association with any process, and |
| 1207 | thus might not be counted against any resource limits. If enabled, |
| 1208 | shared memory segments are automatically destroyed when their attach |
| 1209 | count becomes zero after a detach or a process termination. It will |
| 1210 | also destroy segments that were created, but never attached to, on exit |
Olivier Deprez | 157378f | 2022-04-04 15:47:50 +0200 | [diff] [blame^] | 1211 | from the process. The only use left for ``IPC_RMID`` is to immediately |
David Brazdil | 0f672f6 | 2019-12-10 10:32:29 +0000 | [diff] [blame] | 1212 | destroy an unattached segment. Of course, this breaks the way things are |
| 1213 | defined, so some applications might stop working. Note that this |
| 1214 | feature will do you no good unless you also configure your resource |
Olivier Deprez | 157378f | 2022-04-04 15:47:50 +0200 | [diff] [blame^] | 1215 | limits (in particular, ``RLIMIT_AS`` and ``RLIMIT_NPROC``). Most systems don't |
David Brazdil | 0f672f6 | 2019-12-10 10:32:29 +0000 | [diff] [blame] | 1216 | need this. |
| 1217 | |
| 1218 | Note that if you change this from 0 to 1, already created segments that |
| 1219 | have no users and whose creating process has died will be destroyed. |
| 1220 | |
| 1221 | |
Olivier Deprez | 157378f | 2022-04-04 15:47:50 +0200 | [diff] [blame^] | 1222 | sysctl_writes_strict |
| 1223 | ==================== |
David Brazdil | 0f672f6 | 2019-12-10 10:32:29 +0000 | [diff] [blame] | 1224 | |
| 1225 | Control how file position affects the behavior of updating sysctl values |
Olivier Deprez | 157378f | 2022-04-04 15:47:50 +0200 | [diff] [blame^] | 1226 | via the ``/proc/sys`` interface: |
David Brazdil | 0f672f6 | 2019-12-10 10:32:29 +0000 | [diff] [blame] | 1227 | |
| 1228 | == ====================================================================== |
| 1229 | -1 Legacy per-write sysctl value handling, with no printk warnings. |
| 1230 | Each write syscall must fully contain the sysctl value to be |
| 1231 | written, and multiple writes on the same sysctl file descriptor |
| 1232 | will rewrite the sysctl value, regardless of file position. |
| 1233 | 0 Same behavior as above, but warn about processes that perform writes |
| 1234 | to a sysctl file descriptor when the file position is not 0. |
| 1235 | 1 (default) Respect file position when writing sysctl strings. Multiple |
| 1236 | writes will append to the sysctl value buffer. Anything past the max |
| 1237 | length of the sysctl value buffer will be ignored. Writes to numeric |
| 1238 | sysctl entries must always be at file position 0 and the value must |
| 1239 | be fully contained in the buffer sent in the write syscall. |
| 1240 | == ====================================================================== |
| 1241 | |
| 1242 | |
Olivier Deprez | 157378f | 2022-04-04 15:47:50 +0200 | [diff] [blame^] | 1243 | softlockup_all_cpu_backtrace |
| 1244 | ============================ |
David Brazdil | 0f672f6 | 2019-12-10 10:32:29 +0000 | [diff] [blame] | 1245 | |
| 1246 | This value controls whether the soft lockup detector thread |
| 1247 | gathers further debug information when a soft lockup condition |
| 1248 | is detected. If enabled, each CPU will be issued an NMI and |
| 1249 | instructed to capture a stack trace. |
| 1250 | |
| 1251 | This feature is only applicable for architectures which support |
| 1252 | NMI. |
| 1253 | |
Olivier Deprez | 157378f | 2022-04-04 15:47:50 +0200 | [diff] [blame^] | 1254 | = ============================================ |
| 1255 | 0 Do nothing. This is the default behavior. |
| 1256 | 1 On detection capture more debug information. |
| 1257 | = ============================================ |
David Brazdil | 0f672f6 | 2019-12-10 10:32:29 +0000 | [diff] [blame] | 1258 | |
| 1259 | |
Olivier Deprez | 157378f | 2022-04-04 15:47:50 +0200 | [diff] [blame^] | 1260 | softlockup_panic |
| 1261 | ================ |
| 1262 | |
| 1263 | This parameter can be used to control whether the kernel panics |
| 1264 | when a soft lockup is detected. |
| 1265 | |
| 1266 | = ============================================ |
| 1267 | 0 Don't panic on soft lockup. |
| 1268 | 1 Panic on soft lockup. |
| 1269 | = ============================================ |
| 1270 | |
| 1271 | This can also be set using the softlockup_panic kernel parameter. |
| 1272 | |
| 1273 | |
| 1274 | soft_watchdog |
| 1275 | ============= |
David Brazdil | 0f672f6 | 2019-12-10 10:32:29 +0000 | [diff] [blame] | 1276 | |
| 1277 | This parameter can be used to control the soft lockup detector. |
| 1278 | |
Olivier Deprez | 157378f | 2022-04-04 15:47:50 +0200 | [diff] [blame^] | 1279 | = ================================= |
| 1280 | 0 Disable the soft lockup detector. |
| 1281 | 1 Enable the soft lockup detector. |
| 1282 | = ================================= |
David Brazdil | 0f672f6 | 2019-12-10 10:32:29 +0000 | [diff] [blame] | 1283 | |
| 1284 | The soft lockup detector monitors CPUs for threads that are hogging the CPUs |
| 1285 | without rescheduling voluntarily, and thus prevent the 'watchdog/N' threads |
| 1286 | from running. The mechanism depends on the CPU's ability to respond to timer |
| 1287 | interrupts, which are needed for the 'watchdog/N' threads to be woken up by |
Olivier Deprez | 157378f | 2022-04-04 15:47:50 +0200 | [diff] [blame^] | 1288 | the watchdog timer function; otherwise the NMI watchdog (if enabled) can |
David Brazdil | 0f672f6 | 2019-12-10 10:32:29 +0000 | [diff] [blame] | 1289 | detect a hard lockup condition. |
| 1290 | |
| 1291 | |
Olivier Deprez | 157378f | 2022-04-04 15:47:50 +0200 | [diff] [blame^] | 1292 | stack_erasing |
| 1293 | ============= |
David Brazdil | 0f672f6 | 2019-12-10 10:32:29 +0000 | [diff] [blame] | 1294 | |
| 1295 | This parameter can be used to control kernel stack erasing at the end |
Olivier Deprez | 157378f | 2022-04-04 15:47:50 +0200 | [diff] [blame^] | 1296 | of syscalls for kernels built with ``CONFIG_GCC_PLUGIN_STACKLEAK``. |
David Brazdil | 0f672f6 | 2019-12-10 10:32:29 +0000 | [diff] [blame] | 1297 | |
| 1298 | That erasing reduces the information which kernel stack leak bugs |
| 1299 | can reveal and blocks some uninitialized stack variable attacks. |
| 1300 | The tradeoff is the performance impact: on a single-CPU system, kernel |
| 1301 | compilation sees a 1% slowdown; other systems and workloads may vary. |
| 1302 | |
Olivier Deprez | 157378f | 2022-04-04 15:47:50 +0200 | [diff] [blame^] | 1303 | = ==================================================================== |
| 1304 | 0 Kernel stack erasing is disabled, STACKLEAK_METRICS are not updated. |
| 1305 | 1 Kernel stack erasing is enabled (default), it is performed before |
| 1306 | returning to the userspace at the end of syscalls. |
| 1307 | = ==================================================================== |
David Brazdil | 0f672f6 | 2019-12-10 10:32:29 +0000 | [diff] [blame] | 1308 | |
Olivier Deprez | 157378f | 2022-04-04 15:47:50 +0200 | [diff] [blame^] | 1309 | |
| 1310 | stop-a (SPARC only) |
| 1311 | =================== |
| 1312 | |
| 1313 | Controls Stop-A: |
| 1314 | |
| 1315 | = ==================================== |
| 1316 | 0 Stop-A has no effect. |
| 1317 | 1 Stop-A breaks to the PROM (default). |
| 1318 | = ==================================== |
| 1319 | |
| 1320 | Stop-A is always enabled on a panic, so that the user can return to |
| 1321 | the boot PROM. |
| 1322 | |
| 1323 | |
| 1324 | sysrq |
| 1325 | ===== |
| 1326 | |
| 1327 | See :doc:`/admin-guide/sysrq`. |
David Brazdil | 0f672f6 | 2019-12-10 10:32:29 +0000 | [diff] [blame] | 1328 | |
| 1329 | |
| 1330 | tainted |
| 1331 | ======= |
| 1332 | |
| 1333 | Non-zero if the kernel has been tainted. Numeric values, which can be |
| 1334 | ORed together. The letters are seen in the "Tainted" line of Oops reports. |
| 1335 | |
| 1336 | ====== ===== ============================================================== |
| 1337 | 1 `(P)` proprietary module was loaded |
| 1338 | 2 `(F)` module was force loaded |
| 1339 | 4 `(S)` SMP kernel oops on an officially SMP incapable processor |
| 1340 | 8 `(R)` module was force unloaded |
| 1341 | 16 `(M)` processor reported a Machine Check Exception (MCE) |
| 1342 | 32 `(B)` bad page referenced or some unexpected page flags |
| 1343 | 64 `(U)` taint requested by userspace application |
| 1344 | 128 `(D)` kernel died recently, i.e. there was an OOPS or BUG |
| 1345 | 256 `(A)` an ACPI table was overridden by user |
| 1346 | 512 `(W)` kernel issued warning |
| 1347 | 1024 `(C)` staging driver was loaded |
| 1348 | 2048 `(I)` workaround for bug in platform firmware applied |
| 1349 | 4096 `(O)` externally-built ("out-of-tree") module was loaded |
| 1350 | 8192 `(E)` unsigned module was loaded |
| 1351 | 16384 `(L)` soft lockup occurred |
| 1352 | 32768 `(K)` kernel has been live patched |
| 1353 | 65536 `(X)` Auxiliary taint, defined for and used by distros |
| 1354 | 131072 `(T)` The kernel was built with the struct randomization plugin |
| 1355 | ====== ===== ============================================================== |
| 1356 | |
Olivier Deprez | 157378f | 2022-04-04 15:47:50 +0200 | [diff] [blame^] | 1357 | See :doc:`/admin-guide/tainted-kernels` for more information. |
David Brazdil | 0f672f6 | 2019-12-10 10:32:29 +0000 | [diff] [blame] | 1358 | |
Olivier Deprez | 157378f | 2022-04-04 15:47:50 +0200 | [diff] [blame^] | 1359 | Note: |
| 1360 | writes to this sysctl interface will fail with ``EINVAL`` if the kernel is |
| 1361 | booted with the command line option ``panic_on_taint=<bitmask>,nousertaint`` |
| 1362 | and any of the ORed together values being written to ``tainted`` match with |
| 1363 | the bitmask declared on panic_on_taint. |
| 1364 | See :doc:`/admin-guide/kernel-parameters` for more details on that particular |
| 1365 | kernel command line option and its optional ``nousertaint`` switch. |
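Because the value is a bitmask, it can be decoded by testing each bit against the table above. The sketch below prints only the letters of the bits that are set, in table order; it does not reproduce the exact layout of the "Tainted" line in Oops reports, and the function name ``decode_taint`` is hypothetical.

```shell
# Print the taint letters for every bit set in the given value.
# The letter string follows the table: bit 0 is P (1), bit 17 is T (131072).
decode_taint() {
    value=$1
    letters="PFSRMBUDAWCIOELKXT"
    out=""
    bit=0
    while [ "$bit" -lt 18 ]; do
        if [ $(( (value >> bit) & 1 )) -eq 1 ]; then
            # cut -c is 1-based, so bit N maps to character N+1.
            out="$out$(printf '%s' "$letters" | cut -c $((bit + 1)))"
        fi
        bit=$((bit + 1))
    done
    echo "$out"
}
```

For example, ``decode_taint 4097`` prints ``PO`` (4096 + 1: an out-of-tree module and a proprietary module were loaded). On a running system the input could be ``$(cat /proc/sys/kernel/tainted)``.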
David Brazdil | 0f672f6 | 2019-12-10 10:32:29 +0000 | [diff] [blame] | 1366 | |
Olivier Deprez | 157378f | 2022-04-04 15:47:50 +0200 | [diff] [blame^] | 1367 | threads-max |
| 1368 | =========== |
David Brazdil | 0f672f6 | 2019-12-10 10:32:29 +0000 | [diff] [blame] | 1369 | |
| 1370 | This value controls the maximum number of threads that can be created |
Olivier Deprez | 157378f | 2022-04-04 15:47:50 +0200 | [diff] [blame^] | 1371 | using ``fork()``. |
David Brazdil | 0f672f6 | 2019-12-10 10:32:29 +0000 | [diff] [blame] | 1372 | |
| 1373 | During initialization the kernel sets this value such that even if the |
| 1374 | maximum number of threads is created, the thread structures occupy only |
| 1375 | a part (1/8th) of the available RAM pages. |
| 1376 | |
Olivier Deprez | 157378f | 2022-04-04 15:47:50 +0200 | [diff] [blame^] | 1377 | The minimum value that can be written to ``threads-max`` is 1. |
David Brazdil | 0f672f6 | 2019-12-10 10:32:29 +0000 | [diff] [blame] | 1378 | |
Olivier Deprez | 157378f | 2022-04-04 15:47:50 +0200 | [diff] [blame^] | 1379 | The maximum value that can be written to ``threads-max`` is given by the |
| 1380 | constant ``FUTEX_TID_MASK`` (0x3fffffff). |
David Brazdil | 0f672f6 | 2019-12-10 10:32:29 +0000 | [diff] [blame] | 1381 | |
Olivier Deprez | 157378f | 2022-04-04 15:47:50 +0200 | [diff] [blame^] | 1382 | If a value outside of this range is written to ``threads-max`` an |
| 1383 | ``EINVAL`` error occurs. |
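The accepted range can be checked before writing, to avoid the ``EINVAL`` error. This is a sketch of the range check only; the function name ``valid_threads_max`` is hypothetical and the bounds are the ones stated above.

```shell
# Succeed if $1 lies in the range accepted by threads-max:
# at least 1 and at most FUTEX_TID_MASK (0x3fffffff).
valid_threads_max() {
    [ "$1" -ge 1 ] && [ "$1" -le $((0x3fffffff)) ]
}
```

A caller might use it as ``valid_threads_max "$n" && echo "$n" > /proc/sys/kernel/threads-max``.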
David Brazdil | 0f672f6 | 2019-12-10 10:32:29 +0000 | [diff] [blame] | 1384 | |
| 1385 | |
Olivier Deprez | 157378f | 2022-04-04 15:47:50 +0200 | [diff] [blame^] | 1386 | traceoff_on_warning |
| 1387 | =================== |
| 1388 | |
| 1389 | When set, disables tracing (see :doc:`/trace/ftrace`) when a |
| 1390 | ``WARN()`` is hit. |
| 1391 | |
| 1392 | |
| 1393 | tracepoint_printk |
| 1394 | ================= |
| 1395 | |
| 1396 | When tracepoints are sent to printk() (enabled by the ``tp_printk`` |
| 1397 | boot parameter), this entry provides runtime control:: |
| 1398 | |
| 1399 | echo 0 > /proc/sys/kernel/tracepoint_printk |
| 1400 | |
| 1401 | will stop tracepoints from being sent to printk(), and:: |
| 1402 | |
| 1403 | echo 1 > /proc/sys/kernel/tracepoint_printk |
| 1404 | |
| 1405 | will send them to printk() again. |
| 1406 | |
| 1407 | This only works if the kernel was booted with ``tp_printk`` enabled. |
| 1408 | |
| 1409 | See :doc:`/admin-guide/kernel-parameters` and |
| 1410 | :doc:`/trace/boottime-trace`. |
| 1411 | |
| 1412 | |
| 1413 | .. _unaligned-dump-stack: |
| 1414 | |
| 1415 | unaligned-dump-stack (ia64) |
| 1416 | =========================== |
| 1417 | |
| 1418 | When logging unaligned accesses, controls whether the stack is |
| 1419 | dumped. |
| 1420 | |
| 1421 | = =================================================== |
| 1422 | 0 Do not dump the stack. This is the default setting. |
| 1423 | 1 Dump the stack. |
| 1424 | = =================================================== |
| 1425 | |
| 1426 | See also `ignore-unaligned-usertrap`_. |
| 1427 | |
| 1428 | |
| 1429 | unaligned-trap |
| 1430 | ============== |
| 1431 | |
| 1432 | On architectures where unaligned accesses cause traps, and where this |
| 1433 | feature is supported (``CONFIG_SYSCTL_ARCH_UNALIGN_ALLOW``; currently, |
| 1434 | ``arc`` and ``parisc``), controls whether unaligned traps are caught |
| 1435 | and emulated (instead of failing). |
| 1436 | |
| 1437 | = ======================================================== |
| 1438 | 0 Do not emulate unaligned accesses. |
| 1439 | 1 Emulate unaligned accesses. This is the default setting. |
| 1440 | = ======================================================== |
| 1441 | |
| 1442 | See also `ignore-unaligned-usertrap`_. |
| 1443 | |
| 1444 | |
| 1445 | unknown_nmi_panic |
| 1446 | ================= |
David Brazdil | 0f672f6 | 2019-12-10 10:32:29 +0000 | [diff] [blame] | 1447 | |
| 1448 | The value in this file affects how NMIs are handled. When the |
| 1449 | value is non-zero, an unknown NMI is trapped and a panic occurs. At |
| 1450 | that time, kernel debugging information is displayed on the console. |
| 1451 | |
| 1452 | For example, the NMI switch found on most IA32 servers raises an |
| 1453 | unknown NMI. If a system hangs, try pressing the NMI switch. |
| 1454 | |
| 1455 | |
Olivier Deprez | 157378f | 2022-04-04 15:47:50 +0200 | [diff] [blame^] | 1456 | unprivileged_bpf_disabled |
| 1457 | ========================= |
| 1458 | |
| 1459 | Writing 1 to this entry will disable unprivileged calls to ``bpf()``; |
| 1460 | once disabled, calling ``bpf()`` without ``CAP_SYS_ADMIN`` or ``CAP_BPF`` |
| 1461 | will return ``-EPERM``. Once set to 1, this can't be cleared from the |
| 1462 | running kernel anymore. |
| 1463 | |
| 1464 | Writing 2 to this entry will also disable unprivileged calls to ``bpf()``, |
| 1465 | however, an admin can still change this setting later on, if needed, by |
| 1466 | writing 0 or 1 to this entry. |
| 1467 | |
| 1468 | If ``BPF_UNPRIV_DEFAULT_OFF`` is enabled in the kernel config, then this |
| 1469 | entry will default to 2 instead of 0. |
| 1470 | |
| 1471 | = ============================================================= |
| 1472 | 0 Unprivileged calls to ``bpf()`` are enabled |
| 1473 | 1 Unprivileged calls to ``bpf()`` are disabled without recovery |
| 1474 | 2 Unprivileged calls to ``bpf()`` are disabled |
| 1475 | = ============================================================= |
| 1476 | |
| 1477 | watchdog |
| 1478 | ======== |
David Brazdil | 0f672f6 | 2019-12-10 10:32:29 +0000 | [diff] [blame] | 1479 | |
| 1480 | This parameter can be used to disable or enable the soft lockup detector |
Olivier Deprez | 157378f | 2022-04-04 15:47:50 +0200 | [diff] [blame^] | 1481 | *and* the NMI watchdog (i.e. the hard lockup detector) at the same time. |
David Brazdil | 0f672f6 | 2019-12-10 10:32:29 +0000 | [diff] [blame] | 1482 | |
Olivier Deprez | 157378f | 2022-04-04 15:47:50 +0200 | [diff] [blame^] | 1483 | = ============================== |
| 1484 | 0 Disable both lockup detectors. |
| 1485 | 1 Enable both lockup detectors. |
| 1486 | = ============================== |
David Brazdil | 0f672f6 | 2019-12-10 10:32:29 +0000 | [diff] [blame] | 1487 | |
| 1488 | The soft lockup detector and the NMI watchdog can also be disabled or |
Olivier Deprez | 157378f | 2022-04-04 15:47:50 +0200 | [diff] [blame^] | 1489 | enabled individually, using the ``soft_watchdog`` and ``nmi_watchdog`` |
| 1490 | parameters. |
| 1491 | If the ``watchdog`` parameter is read, for example by executing:: |
David Brazdil | 0f672f6 | 2019-12-10 10:32:29 +0000 | [diff] [blame] | 1492 | |
| 1493 | cat /proc/sys/kernel/watchdog |
| 1494 | |
Olivier Deprez | 157378f | 2022-04-04 15:47:50 +0200 | [diff] [blame^] | 1495 | the output of this command (0 or 1) shows the logical OR of |
| 1496 | ``soft_watchdog`` and ``nmi_watchdog``. |
David Brazdil | 0f672f6 | 2019-12-10 10:32:29 +0000 | [diff] [blame] | 1497 | |
| 1498 | |
Olivier Deprez | 157378f | 2022-04-04 15:47:50 +0200 | [diff] [blame^] | 1499 | watchdog_cpumask |
| 1500 | ================ |
David Brazdil | 0f672f6 | 2019-12-10 10:32:29 +0000 | [diff] [blame] | 1501 | |
| 1502 | This value can be used to control which CPUs the watchdog may run on. |
Olivier Deprez | 157378f | 2022-04-04 15:47:50 +0200 | [diff] [blame^] | 1503 | The default cpumask is all possible cores, but if ``NO_HZ_FULL`` is |
David Brazdil | 0f672f6 | 2019-12-10 10:32:29 +0000 | [diff] [blame] | 1504 | enabled in the kernel config, and cores are specified with the |
Olivier Deprez | 157378f | 2022-04-04 15:47:50 +0200 | [diff] [blame^] | 1505 | ``nohz_full=`` boot argument, those cores are excluded by default. |
David Brazdil | 0f672f6 | 2019-12-10 10:32:29 +0000 | [diff] [blame] | 1506 | Offline cores can be included in this mask, and if the core is later |
| 1507 | brought online, the watchdog will be started based on the mask value. |
| 1508 | |
Olivier Deprez | 157378f | 2022-04-04 15:47:50 +0200 | [diff] [blame^] | 1509 | Typically this value would only be touched in the ``nohz_full`` case |
David Brazdil | 0f672f6 | 2019-12-10 10:32:29 +0000 | [diff] [blame] | 1510 | to re-enable cores that by default were not running the watchdog, |
| 1511 | if a kernel lockup was suspected on those cores. |
| 1512 | |
| 1513 | The argument value is the standard cpulist format for cpumasks, |
| 1514 | so for example to enable the watchdog on cores 0, 2, 3, and 4 you |
| 1515 | might say:: |
| 1516 | |
| 1517 | echo 0,2-4 > /proc/sys/kernel/watchdog_cpumask |
| 1518 | |
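To see which cores a cpulist such as ``0,2-4`` refers to, it can be expanded into individual CPU numbers. The sketch below handles only single CPUs and simple ``first-last`` ranges, which is all the example above uses; the function name ``expand_cpulist`` is hypothetical.

```shell
# Expand a cpulist like "0,2-4" into "0 2 3 4".
# The body runs in a subshell so the IFS change does not leak to the caller.
expand_cpulist() (
    out=""
    IFS=,
    for part in $1; do
        case "$part" in
            *-*) first=${part%-*}; last=${part#*-} ;;  # range "first-last"
            *)   first=$part; last=$part ;;            # single CPU
        esac
        cpu=$first
        while [ "$cpu" -le "$last" ]; do
            out="${out:+$out }$cpu"
            cpu=$((cpu + 1))
        done
    done
    echo "$out"
)
```

For example, ``expand_cpulist 0,2-4`` prints ``0 2 3 4``, the four cores named in the text above.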
| 1519 | |
Olivier Deprez | 157378f | 2022-04-04 15:47:50 +0200 | [diff] [blame^] | 1520 | watchdog_thresh |
| 1521 | =============== |
David Brazdil | 0f672f6 | 2019-12-10 10:32:29 +0000 | [diff] [blame] | 1522 | |
| 1523 | This value can be used to control the frequency of hrtimer and NMI |
| 1524 | events and the soft and hard lockup thresholds. The default threshold |
| 1525 | is 10 seconds. |
| 1526 | |
Olivier Deprez | 157378f | 2022-04-04 15:47:50 +0200 | [diff] [blame^] | 1527 | The softlockup threshold is (``2 * watchdog_thresh``). Setting this |
David Brazdil | 0f672f6 | 2019-12-10 10:32:29 +0000 | [diff] [blame] | 1528 | tunable to zero will disable lockup detection altogether. |