Blame - Documentation/process/coding-style.rst - hafnium/third_party/linux

.. _codingstyle:

Linux kernel coding style
=========================

This is a short document describing the preferred coding style for the
Linux kernel. Coding style is very personal, and I won't force my
views on anybody, but this is what goes for anything that I have to be
able to maintain, and I'd prefer it for most other things too. Please
at least consider the points made here.

First off, I'd suggest printing out a copy of the GNU coding standards,
and NOT read it. Burn them, it's a great symbolic gesture.

Anyway, here goes:


1) Indentation
--------------

Tabs are 8 characters, and thus indentations are also 8 characters.
There are heretic movements that try to make indentations 4 (or even 2!)
characters deep, and that is akin to trying to define the value of PI to
be 3.

Rationale: The whole idea behind indentation is to clearly define where
a block of control starts and ends. Especially when you've been looking
at your screen for 20 straight hours, you'll find it a lot easier to see
how the indentation works if you have large indentations.

Now, some people will claim that having 8-character indentations makes
the code move too far to the right, and makes it hard to read on an
80-character terminal screen. The answer to that is that if you need
more than 3 levels of indentation, you're screwed anyway, and should fix
your program.

In short, 8-char indents make things easier to read, and have the added
benefit of warning you when you're nesting your functions too deep.
Heed that warning.

The preferred way to ease multiple indentation levels in a switch statement is
to align the ``switch`` and its subordinate ``case`` labels in the same column
instead of ``double-indenting`` the ``case`` labels. E.g.:

.. code-block:: c

	switch (suffix) {
	case 'G':
	case 'g':
		mem <<= 30;
		break;
	case 'M':
	case 'm':
		mem <<= 20;
		break;
	case 'K':
	case 'k':
		mem <<= 10;
		fallthrough;
	default:
		break;
	}

Don't put multiple statements on a single line unless you have
something to hide:

.. code-block:: c

	if (condition) do_this;
	  do_something_everytime;

Don't put multiple assignments on a single line either. Kernel coding style
is super simple. Avoid tricky expressions.

Outside of comments, documentation and except in Kconfig, spaces are never
used for indentation, and the above example is deliberately broken.

Get a decent editor and don't leave whitespace at the end of lines.


2) Breaking long lines and strings
----------------------------------

Coding style is all about readability and maintainability using commonly
available tools.

The preferred limit on the length of a single line is 80 columns.

Statements longer than 80 columns should be broken into sensible chunks,
unless exceeding 80 columns significantly increases readability and does
not hide information.

Descendants are always substantially shorter than the parent and
are placed substantially to the right. A very commonly used style
is to align descendants to a function open parenthesis.

These same rules are applied to function headers with a long argument list.

However, never break user-visible strings such as printk messages because
that breaks the ability to grep for them.
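
As a minimal sketch of these rules, here is a userspace function whose header
and call are both broken at sensible points, with the continuation lines
aligned to the open parenthesis, while the format string is kept on one line
so that it stays greppable. ``format_link_status()`` and its parameters are
hypothetical, and ``snprintf()`` merely stands in for a printk-style call.

```c
#include <stdio.h>

/*
 * Hypothetical helper: format a link-status message into buf and return
 * the number of characters snprintf() reports for the full message.
 */
int format_link_status(char *buf, size_t len, const char *ifname,
		       unsigned int speed_mbps, int full_duplex)
{
	/* The user-visible string is not split, even though the call is. */
	return snprintf(buf, len, "%s: link up, %u Mbps, %s duplex\n",
			ifname, speed_mbps, full_duplex ? "full" : "half");
}
```

Searching the tree for ``link up, %u Mbps`` would still find this message,
which would not be the case had the string been split across two lines.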


3) Placing Braces and Spaces
----------------------------

The other issue that always comes up in C styling is the placement of
braces. Unlike the indent size, there are few technical reasons to
choose one placement strategy over the other, but the preferred way, as
shown to us by the prophets Kernighan and Ritchie, is to put the opening
brace last on the line, and put the closing brace first, thusly:

.. code-block:: c

	if (x is true) {
		we do y
	}

This applies to all non-function statement blocks (if, switch, for,
while, do). E.g.:

.. code-block:: c

	switch (action) {
	case KOBJ_ADD:
		return "add";
	case KOBJ_REMOVE:
		return "remove";
	case KOBJ_CHANGE:
		return "change";
	default:
		return NULL;
	}

However, there is one special case, namely functions: they have the
opening brace at the beginning of the next line, thus:

.. code-block:: c

	int function(int x)
	{
		body of function
	}

Heretic people all over the world have claimed that this inconsistency
is ... well ... inconsistent, but all right-thinking people know that
(a) K&R are right and (b) K&R are right. Besides, functions are
special anyway (you can't nest them in C).

Note that the closing brace is empty on a line of its own, except in
the cases where it is followed by a continuation of the same statement,
i.e. a ``while`` in a do-statement or an ``else`` in an if-statement, like
this:

.. code-block:: c

	do {
		body of do-loop
	} while (condition);

and

.. code-block:: c

	if (x == y) {
		..
	} else if (x > y) {
		...
	} else {
		....
	}

Rationale: K&R.

Also, note that this brace-placement also minimizes the number of empty
(or almost empty) lines, without any loss of readability. Thus, as the
supply of new-lines on your screen is not a renewable resource (think
25-line terminal screens here), you have more empty lines to put
comments on.

Do not unnecessarily use braces where a single statement will do.

.. code-block:: c

	if (condition)
		action();

and

.. code-block:: none

	if (condition)
		do_this();
	else
		do_that();

This does not apply if only one branch of a conditional statement is a single
statement; in the latter case use braces in both branches:

.. code-block:: c

	if (condition) {
		do_this();
		do_that();
	} else {
		otherwise();
	}

Also, use braces when a loop contains more than a single simple statement:

.. code-block:: c

	while (condition) {
		if (test)
			do_something();
	}

3.1) Spaces
***********

Linux kernel style for use of spaces depends (mostly) on
function-versus-keyword usage. Use a space after (most) keywords. The
notable exceptions are sizeof, typeof, alignof, and __attribute__, which look
somewhat like functions (and are usually used with parentheses in Linux,
although they are not required in the language, as in: ``sizeof info`` after
``struct fileinfo info;`` is declared).

So use a space after these keywords::

	if, switch, case, for, do, while

but not with sizeof, typeof, alignof, or __attribute__. E.g.,

.. code-block:: c

	s = sizeof(struct file);

Do not add spaces around (inside) parenthesized expressions. This example is
bad:

.. code-block:: c

	s = sizeof( struct file );

When declaring pointer data or a function that returns a pointer type, the
preferred use of ``*`` is adjacent to the data name or function name and not
adjacent to the type name. Examples:

.. code-block:: c

	char *linux_banner;
	unsigned long long memparse(char *ptr, char **retptr);
	char *match_strdup(substring_t *s);

Use one space around (on each side of) most binary and ternary operators,
such as any of these::

	= + - < > * / % | & ^ <= >= == != ? :

but no space after unary operators::

	& * + - ~ ! sizeof typeof alignof __attribute__ defined

no space before the postfix increment & decrement unary operators::

	++ --

no space after the prefix increment & decrement unary operators::

	++ --

and no space around the ``.`` and ``->`` structure member operators.
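
The spacing rules above can be seen together in a small sketch.
``struct fileinfo`` is the name used earlier in this section, but its members
and the two helper functions below are invented for illustration.

```c
#include <stddef.h>

struct fileinfo {
	size_t size;
	int flags;
};

/* Round info->size up to a multiple of align; align == 0 means "as is". */
size_t padded_size(const struct fileinfo *info, size_t align)
{
	size_t size = info->size;	/* no space around "->" */

	if (!align)			/* space after "if", none after "!" */
		return size;

	return (size + align - 1) / align * align;
}

/* Count the entries that have at least one bit of mask set. */
int count_flagged(const struct fileinfo *infos, size_t nr, int mask)
{
	int count = 0;
	size_t i;

	for (i = 0; i < nr; i++) {
		if (infos[i].flags & mask)
			count++;	/* no space before postfix "++" */
	}

	return count;
}
```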

Do not leave trailing whitespace at the ends of lines. Some editors with
``smart`` indentation will insert whitespace at the beginning of new lines as
appropriate, so you can start typing the next line of code right away.
However, some such editors do not remove the whitespace if you end up not
putting a line of code there, such as if you leave a blank line. As a result,
you end up with lines containing trailing whitespace.

Git will warn you about patches that introduce trailing whitespace, and can
optionally strip the trailing whitespace for you; however, if applying a series
of patches, this may make later patches in the series fail by changing their
context lines.


4) Naming
---------

C is a Spartan language, and your naming conventions should follow suit.
Unlike Modula-2 and Pascal programmers, C programmers do not use cute
names like ThisVariableIsATemporaryCounter. A C programmer would call that
variable ``tmp``, which is much easier to write, and not the least more
difficult to understand.

HOWEVER, while mixed-case names are frowned upon, descriptive names for
global variables are a must. To call a global function ``foo`` is a
shooting offense.

GLOBAL variables (to be used only if you really need them) need to
have descriptive names, as do global functions. If you have a function
that counts the number of active users, you should call that
``count_active_users()`` or similar; you should not call it ``cntusr()``.

Encoding the type of a function into the name (so-called Hungarian
notation) is asinine - the compiler knows the types anyway and can check
those, and it only confuses the programmer. No wonder Microsoft makes buggy
programs.

LOCAL variable names should be short, and to the point. If you have
some random integer loop counter, it should probably be called ``i``.
Calling it ``loop_counter`` is non-productive, if there is no chance of it
being misunderstood. Similarly, ``tmp`` can be just about any type of
variable that is used to hold a temporary value.
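
A short sketch of both rules at once: a descriptive name for the global
function, terse names for its locals. ``count_active_users()`` is the name
suggested above; ``struct user`` and its ``active`` member are assumptions
made up for this example.

```c
#include <stddef.h>

struct user {
	int active;
};

/* Global function: the name says what it does. */
int count_active_users(const struct user *users, size_t nr_users)
{
	int count = 0;
	size_t i;	/* plain loop counter, so a short name is enough */

	for (i = 0; i < nr_users; i++) {
		if (users[i].active)
			count++;
	}

	return count;
}
```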

If you are afraid to mix up your local variable names, you have another
problem, which is called the function-growth-hormone-imbalance syndrome.
See chapter 6 (Functions).

For symbol names and documentation, avoid introducing new usage of
'master / slave' (or 'slave' independent of 'master') and 'blacklist /
whitelist'.

Recommended replacements for 'master / slave' are:
    '{primary,main} / {secondary,replica,subordinate}'
    '{initiator,requester} / {target,responder}'
    '{controller,host} / {device,worker,proxy}'
    'leader / follower'
    'director / performer'

Recommended replacements for 'blacklist/whitelist' are:
    'denylist / allowlist'
    'blocklist / passlist'

Exceptions for introducing new usage are maintaining a userspace ABI/API,
or updating code for an existing (as of 2020) hardware or protocol
specification that mandates those terms. For new specifications
translate specification usage of the terminology to the kernel coding
standard where possible.

5) Typedefs
-----------

Please don't use things like ``vps_t``.
It's a mistake to use typedef for structures and pointers. When you see a

.. code-block:: c

	vps_t a;

in the source, what does it mean?
In contrast, if it says

.. code-block:: c

	struct virtual_container *a;

you can actually tell what ``a`` is.

Lots of people think that typedefs ``help readability``. Not so. They are
useful only for:

 (a) totally opaque objects (where the typedef is actively used to hide
     what the object is).

     Example: ``pte_t`` etc. opaque objects that you can only access using
     the proper accessor functions.

     .. note::

       Opaqueness and ``accessor functions`` are not good in themselves.
       The reason we have them for things like pte_t etc. is that there
       really is absolutely zero portably accessible information there.

 (b) Clear integer types, where the abstraction helps avoid confusion
     whether it is ``int`` or ``long``.

     u8/u16/u32 are perfectly fine typedefs, although they fit into
     category (d) better than here.

     .. note::

       Again - there needs to be a reason for this. If something is
       ``unsigned long``, then there's no reason to do

         typedef unsigned long myflags_t;

     but if there is a clear reason for why it under certain circumstances
     might be an ``unsigned int`` and under other configurations might be
     ``unsigned long``, then by all means go ahead and use a typedef.

 (c) when you use sparse to literally create a new type for
     type-checking.

 (d) New types which are identical to standard C99 types, in certain
     exceptional circumstances.

     Although it would only take a short amount of time for the eyes and
     brain to become accustomed to the standard types like ``uint32_t``,
     some people object to their use anyway.

     Therefore, the Linux-specific ``u8/u16/u32/u64`` types and their
     signed equivalents which are identical to standard types are
     permitted -- although they are not mandatory in new code of your
     own.

     When editing existing code which already uses one or the other set
     of types, you should conform to the existing choices in that code.

 (e) Types safe for use in userspace.

     In certain structures which are visible to userspace, we cannot
     require C99 types and cannot use the ``u32`` form above. Thus, we
     use __u32 and similar types in all structures which are shared
     with userspace.

Maybe there are other cases too, but the rule should basically be to NEVER
EVER use a typedef unless you can clearly match one of those rules.

In general, a pointer, or a struct that has elements that can reasonably
be directly accessed should never be a typedef.
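
As a hedged illustration of case (b) next to the general rule, the sketch
below uses a typedef only where the underlying integer type may change with
the build, and leaves the structure as a plain ``struct``.
``DEMO_WIDE_FLAGS``, ``demo_flags_t`` and ``container_is_empty()`` are
invented names; ``struct virtual_container`` is borrowed from the example
above, with made-up members.

```c
/*
 * Case (b): the width of the flags type depends on a (hypothetical)
 * build-time switch, so hiding it behind a typedef has a reason.
 */
#ifdef DEMO_WIDE_FLAGS
typedef unsigned long demo_flags_t;
#else
typedef unsigned int demo_flags_t;
#endif

/* Members are accessed directly, so this stays a plain struct. */
struct virtual_container {
	demo_flags_t flags;
	int nr_children;
};

int container_is_empty(const struct virtual_container *vc)
{
	return vc->nr_children == 0;
}
```

A reader who meets ``struct virtual_container *vc`` can tell at a glance
that it is a pointer to a structure, which ``vps_t a;`` never reveals.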


6) Functions
------------

Functions should be short and sweet, and do just one thing. They should
fit on one or two screenfuls of text (the ISO/ANSI screen size is 80x24,
as we all know), and do one thing and do that well.

The maximum length of a function is inversely proportional to the
complexity and indentation level of that function. So, if you have a
conceptually simple function that is just one long (but simple)
case-statement, where you have to do lots of small things for a lot of
different cases, it's OK to have a longer function.

However, if you have a complex function, and you suspect that a
less-than-gifted first-year high-school student might not even
understand what the function is all about, you should adhere to the
maximum limits all the more closely. Use helper functions with
descriptive names (you can ask the compiler to in-line them if you think
it's performance-critical, and it will probably do a better job of it
than you would have done).

Another measure of the function is the number of local variables. They
shouldn't exceed 5-10, or you're doing something wrong. Re-think the
function, and split it into smaller pieces. A human brain can
generally easily keep track of about 7 different things, anything more
and it gets confused. You know you're brilliant, but maybe you'd like
to understand what you did 2 weeks from now.

In source files, separate functions with one blank line. If the function is
exported, the EXPORT macro for it should follow immediately after the
closing function brace line. E.g.:

.. code-block:: c

	int system_is_up(void)
	{
		return system_state == SYSTEM_RUNNING;
	}
	EXPORT_SYMBOL(system_is_up);

In function prototypes, include parameter names with their data types.
Although this is not required by the C language, it is preferred in Linux
because it is a simple way to add valuable information for the reader.

Do not use the ``extern`` keyword with function prototypes as this makes
lines longer and isn't strictly necessary.
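
To illustrate the last two points, here is a prototype as it might appear in
a header, followed by its definition. ``copy_bytes()`` is an invented helper,
not a kernel function.

```c
#include <stddef.h>

/* Prototype: named parameters, and no "extern". */
size_t copy_bytes(char *dst, const char *src, size_t len);

/* Copy len bytes from src to dst and return the number of bytes copied. */
size_t copy_bytes(char *dst, const char *src, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++)
		dst[i] = src[i];

	return len;
}
```

Compare the prototype with ``size_t copy_bytes(char *, const char *, size_t);``
which compiles just as well but leaves the reader guessing which pointer is
the destination.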


7) Centralized exiting of functions
-----------------------------------

Albeit deprecated by some people, the equivalent of the goto statement is
used frequently by compilers in form of the unconditional jump instruction.

The goto statement comes in handy when a function exits from multiple
locations and some common work such as cleanup has to be done. If there is no
cleanup needed then just return directly.

Choose label names which say what the goto does or why the goto exists. An
example of a good name could be ``out_free_buffer:`` if the goto frees ``buffer``.
Avoid using GW-BASIC names like ``err1:`` and ``err2:``, as you would have to
renumber them if you ever add or remove exit paths, and they make correctness
difficult to verify anyway.

The rationale for using gotos is:

- unconditional statements are easier to understand and follow
- nesting is reduced
- errors by not updating individual exit points when making
  modifications are prevented
- saves the compiler work to optimize redundant code away ;)

.. code-block:: c

	int fun(int a)
	{
		int result = 0;
		char *buffer;

		buffer = kmalloc(SIZE, GFP_KERNEL);
		if (!buffer)
			return -ENOMEM;

		if (condition1) {
			while (loop1) {
				...
			}
			result = 1;
			goto out_free_buffer;
		}
		...
	out_free_buffer:
		kfree(buffer);
		return result;
	}

A common type of bug to be aware of is ``one err bugs`` which look like this:

.. code-block:: c

	err:
		kfree(foo->bar);
		kfree(foo);
		return ret;

The bug in this code is that on some exit paths ``foo`` is NULL. Normally the
fix for this is to split it up into two error labels ``err_free_bar:`` and
``err_free_foo:``:

.. code-block:: c

	err_free_bar:
		kfree(foo->bar);
	err_free_foo:
		kfree(foo);
		return ret;

Ideally you should simulate errors to test all exit paths.
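
The two-label fix can be sketched as a complete userspace function, with
``malloc()``/``free()`` standing in for ``kmalloc()``/``kfree()``.
``foo_create()`` and its ``fail_step`` parameter are hypothetical;
``fail_step`` is only a test hook that simulates an error, so that each exit
path can be exercised.

```c
#include <errno.h>
#include <stdlib.h>

struct foo {
	char *bar;
};

/*
 * Allocate a foo and its bar buffer.  fail_step simulates an error:
 * 1 = the bar allocation fails, 2 = a later step fails, 0 = no failure.
 */
int foo_create(struct foo **out, int fail_step)
{
	struct foo *foo;
	int ret;

	foo = malloc(sizeof(*foo));
	if (!foo)
		return -ENOMEM;	/* nothing to clean up yet */

	/* fail_step == 1 pretends that this allocation failed */
	foo->bar = (fail_step == 1) ? NULL : malloc(16);
	if (!foo->bar) {
		ret = -ENOMEM;
		goto err_free_foo;
	}

	if (fail_step == 2) {
		ret = -EINVAL;
		goto err_free_bar;
	}

	*out = foo;
	return 0;

err_free_bar:
	free(foo->bar);
err_free_foo:
	free(foo);
	return ret;
}
```

Each label undoes exactly one step, and the labels appear in the reverse
order of the allocations, so a jump to a later label skips the cleanup of
anything that was never set up.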


8) Commenting
-------------

Comments are good, but there is also a danger of over-commenting. NEVER
try to explain HOW your code works in a comment: it's much better to
write the code so that the working is obvious, and it's a waste of
time to explain badly written code.

Generally, you want your comments to tell WHAT your code does, not HOW.
Also, try to avoid putting comments inside a function body: if the
function is so complex that you need to separately comment parts of it,
you should probably go back to chapter 6 for a while. You can make
small comments to note or warn about something particularly clever (or
ugly), but try to avoid excess. Instead, put the comments at the head
of the function, telling people what it does, and possibly WHY it does
it.

When commenting the kernel API functions, please use the kernel-doc format.
See the files at :ref:`Documentation/doc-guide/ <doc_guide>` and
``scripts/kernel-doc`` for details.

The preferred style for long (multi-line) comments is:

.. code-block:: c

	/*
	 * This is the preferred style for multi-line
	 * comments in the Linux kernel source code.
	 * Please use it consistently.
	 *
	 * Description: A column of asterisks on the left side,
	 * with beginning and ending almost-blank lines.
	 */

For files in net/ and drivers/net/ the preferred style for long (multi-line)
comments is a little different.

.. code-block:: c

	/* The preferred comment style for files in net/ and drivers/net
	 * looks like this.
	 *
	 * It is nearly the same as the generally preferred comment style,
	 * but there is no initial almost-blank line.
	 */

It's also important to comment data, whether they are basic types or derived
types. To this end, use just one data declaration per line (no commas for
multiple data declarations). This leaves you room for a small comment on each
item, explaining its use.
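
A small sketch of commented data, with one declaration per line, and of a
function whose comment sits at its head and says WHAT it does.
``struct demo_stats`` and ``demo_stats_total()`` are hypothetical names used
only for this example.

```c
/* Hypothetical per-device statistics, one member per line. */
struct demo_stats {
	unsigned long rx_packets;	/* packets received */
	unsigned long tx_packets;	/* packets transmitted */
	unsigned int rx_errors;		/* receive errors seen so far */
};

/*
 * Return the total number of packets handled, so that callers do not
 * have to know which counters exist.
 */
unsigned long demo_stats_total(const struct demo_stats *stats)
{
	return stats->rx_packets + stats->tx_packets;
}
```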


9) You've made a mess of it
---------------------------

That's OK, we all do. You've probably been told by your long-time Unix
user helper that ``GNU emacs`` automatically formats the C sources for
you, and you've noticed that yes, it does do that, but the defaults it
uses are less than desirable (in fact, they are worse than random
typing - an infinite number of monkeys typing into GNU emacs would never
make a good program).

So, you can either get rid of GNU emacs, or change it to use saner
values. To do the latter, you can stick the following in your .emacs file:

.. code-block:: none

  (defun c-lineup-arglist-tabs-only (ignored)
    "Line up argument lists by tabs, not spaces"
    (let* ((anchor (c-langelem-pos c-syntactic-element))
           (column (c-langelem-2nd-pos c-syntactic-element))
           (offset (- (1+ column) anchor))
           (steps (floor offset c-basic-offset)))
      (* (max steps 1)
         c-basic-offset)))

  (dir-locals-set-class-variables
   'linux-kernel
   '((c-mode . (
          (c-basic-offset . 8)
          (c-label-minimum-indentation . 0)
          (c-offsets-alist . (
                  (arglist-close . c-lineup-arglist-tabs-only)
                  (arglist-cont-nonempty .
                      (c-lineup-gcc-asm-reg c-lineup-arglist-tabs-only))
                  (arglist-intro . +)
                  (brace-list-intro . +)
                  (c . c-lineup-C-comments)
                  (case-label . 0)
                  (comment-intro . c-lineup-comment)
                  (cpp-define-intro . +)
                  (cpp-macro . -1000)
                  (cpp-macro-cont . +)
                  (defun-block-intro . +)
                  (else-clause . 0)
                  (func-decl-cont . +)
                  (inclass . +)
                  (inher-cont . c-lineup-multi-inher)
                  (knr-argdecl-intro . 0)
                  (label . -1000)
                  (statement . 0)
                  (statement-block-intro . +)
                  (statement-case-intro . +)
                  (statement-cont . +)
                  (substatement . +)
                  ))
          (indent-tabs-mode . t)
          (show-trailing-whitespace . t)
          ))))

  (dir-locals-set-directory-class
   (expand-file-name "~/src/linux-trees")
   'linux-kernel)

This will make emacs go better with the kernel coding style for C
files below ``~/src/linux-trees``.

But even if you fail in getting emacs to do sane formatting, not
everything is lost: use ``indent``.

Now, again, GNU indent has the same brain-dead settings that GNU emacs
has, which is why you need to give it a few command line options.
However, that's not too bad, because even the makers of GNU indent
recognize the authority of K&R (the GNU people aren't evil, they are
just severely misguided in this matter), so you just give indent the
options ``-kr -i8`` (stands for ``K&R, 8 character indents``), or use
``scripts/Lindent``, which indents in the latest style.

``indent`` has a lot of options, and especially when it comes to comment
re-formatting you may want to take a look at the man page. But
remember: ``indent`` is not a fix for bad programming.

Note that you can also use the ``clang-format`` tool to help you with
these rules, to quickly re-format parts of your code automatically,
and to review full files in order to spot coding style mistakes,
typos and possible improvements. It is also handy for sorting ``#includes``,
for aligning variables/macros, for reflowing text and other similar tasks.
See the file :ref:`Documentation/process/clang-format.rst <clangformat>`
for more details.


10) Kconfig configuration files
-------------------------------

For all of the Kconfig* configuration files throughout the source tree,
the indentation is somewhat different. Lines under a ``config`` definition
are indented with one tab, while help text is indented an additional two
spaces. Example::

  config AUDIT
	bool "Auditing support"
	depends on NET
	help
	  Enable auditing infrastructure that can be used with another
	  kernel subsystem, such as SELinux (which requires this for
	  logging of avc messages output). Does not do system-call
	  auditing without CONFIG_AUDITSYSCALL.

Seriously dangerous features (such as write support for certain
filesystems) should advertise this prominently in their prompt string::

  config ADFS_FS_RW
	bool "ADFS write support (DANGEROUS)"
	depends on ADFS_FS
	...

For full documentation on the configuration files, see the file
Documentation/kbuild/kconfig-language.rst.


11) Data structures
-------------------

Data structures that have visibility outside the single-threaded
environment they are created and destroyed in should always have
reference counts. In the kernel, garbage collection doesn't exist (and
outside the kernel garbage collection is slow and inefficient), which
means that you absolutely have to reference count all your uses.

Reference counting means that you can avoid locking, and allows multiple
users to have access to the data structure in parallel - and not having
to worry about the structure suddenly going away from under them just
because they slept or did something else for a while.

Note that locking is not a replacement for reference counting.
Locking is used to keep data structures coherent, while reference
counting is a memory management technique. Usually both are needed, and
they are not to be confused with each other.

Many data structures can indeed have two levels of reference counting,
when there are users of different ``classes``. The subclass count counts
the number of subclass users, and decrements the global count just once
when the subclass count goes to zero.

Examples of this kind of ``multi-level-reference-counting`` can be found in
memory management (``struct mm_struct``: mm_users and mm_count), and in
filesystem code (``struct super_block``: s_count and s_active).

Remember: if another thread can find your data structure, and you don't
have a reference count on it, you almost certainly have a bug.
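
The get/put pattern behind a reference count can be sketched in a few lines
of userspace C. This is a deliberately simplified, single-threaded
illustration: the counter is a plain ``int``, whereas kernel code would
normally use an atomic counter type such as ``refcount_t`` or ``struct kref``.
``struct demo_obj`` and the ``demo_obj_*()`` helpers are invented names.

```c
#include <stdlib.h>

struct demo_obj {
	int refcount;	/* NOT atomic: single-threaded illustration only */
	int payload;
};

struct demo_obj *demo_obj_alloc(int payload)
{
	struct demo_obj *obj = malloc(sizeof(*obj));

	if (!obj)
		return NULL;

	obj->refcount = 1;	/* the creator holds the first reference */
	obj->payload = payload;
	return obj;
}

/* Take an additional reference on an object the caller already holds. */
void demo_obj_get(struct demo_obj *obj)
{
	obj->refcount++;
}

/* Drop one reference; returns 1 if the object was freed, 0 otherwise. */
int demo_obj_put(struct demo_obj *obj)
{
	if (--obj->refcount > 0)
		return 0;

	free(obj);
	return 1;
}
```

Every user pairs one ``demo_obj_get()`` with one ``demo_obj_put()``, and the
object goes away only when the last reference is dropped, no matter which
user happens to drop it.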


12) Macros, Enums and RTL
-------------------------

Names of macros defining constants and labels in enums are capitalized.

.. code-block:: c

	#define CONSTANT 0x12345

Enums are preferred when defining several related constants.

CAPITALIZED macro names are appreciated but macros resembling functions
may be named in lower case.

Generally, inline functions are preferable to macros resembling functions.

Macros with multiple statements should be enclosed in a do - while block:

.. code-block:: c

	#define macrofun(a, b, c)			\
		do {					\
			if (a == 5)			\
				do_this(b, c);		\
		} while (0)
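
One reason for the do - while wrapper is that the macro then behaves like a
single statement that needs its trailing semicolon, so it can sit in an
unbraced ``if``/``else``. The sketch below assumes a trivial ``do_this()``
and a hypothetical caller ``demo_dispatch()``; the macro arguments are also
parenthesized here, which the example above omits.

```c
static int calls;

/* Stand-in for the real work: remember what was passed in. */
static void do_this(int b, int c)
{
	calls += b + c;
}

#define macrofun(a, b, c)			\
	do {					\
		if ((a) == 5)			\
			do_this((b), (c));	\
	} while (0)

/* Hypothetical caller: the macro is used as an unbraced if-branch. */
int demo_dispatch(int a, int fallback)
{
	if (a > 0)
		macrofun(a, 1, 2);
	else
		return fallback;

	return calls;
}
```

Had the macro body been a bare ``{ ... }`` block, the semicolon after
``macrofun(a, 1, 2)`` would end the ``if`` statement and the ``else`` would
no longer compile.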
				774
				775	Things to avoid when using macros:
				776
				777	1) macros that affect control flow:
				778
				779	.. code-block:: c
				780
				781	#define FOO(x) \
				782	do { \
				783	if (blah(x) < 0) \
				784	return -EBUGGERED; \
				785	} while (0)
				786
				787	is a very bad idea. It looks like a function call but exits the ``calling``
				788	function; don't break the internal parsers of those who will read the code.
				789
				790	2) macros that depend on having a local variable with a magic name:
				791
.. code-block:: c

	#define FOO(val) bar(index, val)
				795
				796	might look like a good thing, but it's confusing as hell when one reads the
				797	code and it's prone to breakage from seemingly innocent changes.
				798
				799	3) macros with arguments that are used as l-values: FOO(x) = y; will
				800	bite you if somebody e.g. turns FOO into an inline function.
				801
				802	4) forgetting about precedence: macros defining constants using expressions
				803	must enclose the expression in parentheses. Beware of similar issues with
				804	macros using parameters.
				805
.. code-block:: c

	#define CONSTANT 0x4000
	#define CONSTEXP (CONSTANT | 3)
				810
				811	5) namespace collisions when defining local variables in macros resembling
				812	functions:
				813
.. code-block:: c

	#define FOO(x)				\
	({					\
		typeof(x) ret;			\
		ret = calc_ret(x);		\
		(ret);				\
	})
				822
				823	ret is a common name for a local variable - __foo_ret is less likely
				824	to collide with an existing variable.
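Points 4 and 5 can be sketched as follows. ``CONSTEXP_BAD``, ``bad_doubled()``, ``good_doubled()``, ``caller()`` and the body of ``calc_ret()`` are hypothetical, and the second half relies on the GNU C statement-expression and typeof extensions, as the macro above does.

```c
#define CONSTANT 0x4000

/* Hypothetical name: the same expression without parentheses. */
#define CONSTEXP_BAD	CONSTANT | 3
#define CONSTEXP	(CONSTANT | 3)

/* '*' binds tighter than '|', so the unparenthesized form misparses. */
int bad_doubled(void)
{
	return CONSTEXP_BAD * 2;	/* CONSTANT | (3 * 2) */
}

int good_doubled(void)
{
	return CONSTEXP * 2;		/* (CONSTANT | 3) * 2 */
}

/* Hypothetical stand-in for the calc_ret() used by FOO(). */
int calc_ret(int x)
{
	return x + 1;
}

/* A prefixed local name is unlikely to clash with the caller's names. */
#define FOO(x)					\
({						\
	__typeof__(x) __foo_ret;		\
	__foo_ret = calc_ret(x);		\
	(__foo_ret);				\
})

int caller(void)
{
	int ret = 41;	/* would clash with a macro-local named 'ret' */

	return FOO(ret);
}
```

Had the macro's local been called ``ret``, the argument ``ret`` inside the expansion would refer to the macro's own uninitialized variable rather than the caller's.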
				825
				826	The cpp manual deals with macros exhaustively. The gcc internals manual also
				827	covers RTL which is used frequently with assembly language in the kernel.
				828
				829
				830	13) Printing kernel messages
				831	----------------------------
				832
Kernel developers like to be seen as literate. Do mind the spelling
of kernel messages to make a good impression. Do not use incorrect
contractions like ``dont``; use ``do not`` or ``don't`` instead. Make the
messages concise, clear, and unambiguous.

Kernel messages do not have to be terminated with a period.
				839
				840	Printing numbers in parentheses (%d) adds no value and should be avoided.
				841
				842	There are a number of driver model diagnostic macros in <linux/device.h>
				843	which you should use to make sure messages are matched to the right device
				844	and driver, and are tagged with the right level: dev_err(), dev_warn(),
				845	dev_info(), and so forth. For messages that aren't associated with a
				846	particular device, <linux/printk.h> defines pr_notice(), pr_info(),
				847	pr_warn(), pr_err(), etc.
				848
				849	Coming up with good debugging messages can be quite a challenge; and once
				850	you have them, they can be a huge help for remote troubleshooting. However
				851	debug message printing is handled differently than printing other non-debug
				852	messages. While the other pr_XXX() functions print unconditionally,
				853	pr_debug() does not; it is compiled out by default, unless either DEBUG is
				854	defined or CONFIG_DYNAMIC_DEBUG is set. That is true for dev_dbg() also,
				855	and a related convention uses VERBOSE_DEBUG to add dev_vdbg() messages to
				856	the ones already enabled by DEBUG.
				857
				858	Many subsystems have Kconfig debug options to turn on -DDEBUG in the
				859	corresponding Makefile; in other cases specific files #define DEBUG. And
				860	when a debug message should be unconditionally printed, such as if it is
				861	already inside a debug-related #ifdef section, printk(KERN_DEBUG ...) can be
				862	used.
				863
				864
				865	14) Allocating memory
				866	---------------------
				867
The kernel provides the following general purpose memory allocators:
kmalloc(), kzalloc(), kmalloc_array(), kcalloc(), vmalloc(), and
vzalloc(). Please refer to the API documentation for further information
about them: :ref:`Documentation/core-api/memory-allocation.rst
<memory_allocation>`.

The preferred form for passing a size of a struct is the following:
				875
.. code-block:: c

	p = kmalloc(sizeof(*p), ...);
				879
				880	The alternative form where struct name is spelled out hurts readability and
				881	introduces an opportunity for a bug when the pointer variable type is changed
				882	but the corresponding sizeof that is passed to a memory allocator is not.
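A userspace sketch of the preferred form, with malloc() standing in for kmalloc() and a hypothetical ``struct widget``:

```c
#include <stdlib.h>

/* Hypothetical structure, for illustration only. */
struct widget {
	int id;
	char name[16];
};

/*
 * Userspace stand-in: malloc() plays the role of kmalloc().
 * sizeof(*w) follows the pointer's type, so changing the type of 'w'
 * cannot leave a stale struct name behind in the size expression.
 */
struct widget *widget_alloc(int id)
{
	struct widget *w;

	w = malloc(sizeof(*w));
	if (!w)
		return NULL;
	w->id = id;
	w->name[0] = '\0';
	return w;
}
```

The return value of malloc() is assigned without a cast, for the same reason given below for the kernel allocators.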
				883
				884	Casting the return value which is a void pointer is redundant. The conversion
				885	from void pointer to any other pointer type is guaranteed by the C programming
				886	language.
				887
				888	The preferred form for allocating an array is the following:
				889
.. code-block:: c

	p = kmalloc_array(n, sizeof(...), ...);
				893
				894	The preferred form for allocating a zeroed array is the following:
				895
.. code-block:: c

	p = kcalloc(n, sizeof(...), ...);
				899
				900	Both forms check for overflow on the allocation size n * sizeof(...),
				901	and return NULL if that occurred.
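The kind of overflow check involved can be sketched in userspace as follows. ``alloc_array_zeroed()`` is a hypothetical helper written for this example; it is not how the kernel allocators are implemented.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical userspace helper: refuse the request when n * size would
 * not fit in a size_t, otherwise allocate and zero the memory.
 */
void *alloc_array_zeroed(size_t n, size_t size)
{
	void *p;

	if (size != 0 && n > SIZE_MAX / size)
		return NULL;	/* n * size would overflow */

	p = malloc(n * size);
	if (p)
		memset(p, 0, n * size);
	return p;
}
```

An open-coded ``malloc(n * size)`` would instead wrap around silently and return a buffer far smaller than the caller expects.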
				902
These generic allocation functions all emit a stack dump on failure when used
without __GFP_NOWARN, so there is no use in emitting an additional failure
message when NULL is returned.


15) The inline disease
----------------------
				909
				910	There appears to be a common misperception that gcc has a magic "make me
				911	faster" speedup option called ``inline``. While the use of inlines can be
				912	appropriate (for example as a means of replacing macros, see Chapter 12), it
				913	very often is not. Abundant use of the inline keyword leads to a much bigger
				914	kernel, which in turn slows the system as a whole down, due to a bigger
				915	icache footprint for the CPU and simply because there is less memory
				916	available for the pagecache. Just think about it; a pagecache miss causes a
				917	disk seek, which easily takes 5 milliseconds. There are a LOT of cpu cycles
				918	that can go into these 5 milliseconds.
				919
A reasonable rule of thumb is to not put inline at functions that have more
than 3 lines of code in them. The exceptions to this rule are cases where
a parameter is known to be a compile-time constant, and as a result of this
constantness you know the compiler will be able to optimize most of your
function away at compile time. For a good example of this latter case, see
the kmalloc() inline function.
				926
				927	Often people argue that adding inline to functions that are static and used
				928	only once is always a win since there is no space tradeoff. While this is
				929	technically correct, gcc is capable of inlining these automatically without
				930	help, and the maintenance issue of removing the inline when a second user
				931	appears outweighs the potential value of the hint that tells gcc to do
				932	something it would have done anyway.
				933
				934
				935	16) Function return values and names
				936	------------------------------------
				937
				938	Functions can return values of many different kinds, and one of the
				939	most common is a value indicating whether the function succeeded or
				940	failed. Such a value can be represented as an error-code integer
				941	(-Exxx = failure, 0 = success) or a ``succeeded`` boolean (0 = failure,
				942	non-zero = success).
				943
Mixing up these two sorts of representations is a fertile source of
difficult-to-find bugs. If the C language included a strong distinction
between integers and booleans then the compiler would find these mistakes
for us... but it doesn't. To help prevent such bugs, always follow this
convention::

	If the name of a function is an action or an imperative command,
	the function should return an error-code integer. If the name
	is a predicate, the function should return a "succeeded" boolean.
				953
				954	For example, ``add work`` is a command, and the add_work() function returns 0
				955	for success or -EBUSY for failure. In the same way, ``PCI device present`` is
				956	a predicate, and the pci_dev_present() function returns 1 if it succeeds in
				957	finding a matching device or 0 if it doesn't.
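A small userspace sketch of the two conventions side by side. The bodies below, ``QUEUE_MAX``, ``queue_len`` and ``work_pending()`` are hypothetical; only the add_work() name and its 0 / -EBUSY results come from the text above.

```c
#include <errno.h>
#include <stdbool.h>

#define QUEUE_MAX 2	/* hypothetical capacity */

int queue_len;

/* Action: returns 0 on success or a negative error code on failure. */
int add_work(void)
{
	if (queue_len >= QUEUE_MAX)
		return -EBUSY;
	queue_len++;
	return 0;
}

/* Predicate: returns true when the condition holds. */
bool work_pending(void)
{
	return queue_len > 0;
}
```

At the call site, ``if (add_work())`` then reads as "if it failed" and ``if (work_pending())`` as "if it is so", which is why the two must not be mixed up.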
				958
				959	All EXPORTed functions must respect this convention, and so should all
				960	public functions. Private (static) functions need not, but it is
				961	recommended that they do.
				962
				963	Functions whose return value is the actual result of a computation, rather
				964	than an indication of whether the computation succeeded, are not subject to
				965	this rule. Generally they indicate failure by returning some out-of-range
				966	result. Typical examples would be functions that return pointers; they use
				967	NULL or the ERR_PTR mechanism to report failure.
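The ERR_PTR mechanism encodes a small negative error code in the pointer value itself. The following is a userspace imitation of the kernel helpers of the same names, not the kernel source; it assumes the top 4095 addresses are never valid pointers, and ``table`` and ``lookup()`` are hypothetical.

```c
#include <errno.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Userspace imitations of the kernel helpers of the same names. */
static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical lookup: returns a pointer, or an encoded error code. */
int table[4] = { 10, 20, 30, 40 };

int *lookup(int index)
{
	if (index < 0 || index >= 4)
		return ERR_PTR(-EINVAL);
	return &table[index];
}
```

A caller checks ``IS_ERR(p)`` before dereferencing and recovers the error code with ``PTR_ERR(p)``.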
				968
				969
17) Using bool
--------------
				972
				973	The Linux kernel bool type is an alias for the C99 _Bool type. bool values can
				974	only evaluate to 0 or 1, and implicit or explicit conversion to bool
				975	automatically converts the value to true or false. When using bool types the
				976	!! construction is not needed, which eliminates a class of bugs.
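The class of bugs in question can be sketched like this. ``FLAG_READY`` and both functions are hypothetical, and the example assumes 8-bit chars.

```c
#include <stdbool.h>

#define FLAG_READY 0x100	/* hypothetical flag, above the low 8 bits */

/* Conversion to bool yields true for any non-zero value. */
bool is_ready(unsigned int flags)
{
	return flags & FLAG_READY;
}

/* A narrow integer type truncates instead, so !! would be needed here. */
unsigned char is_ready_u8(unsigned int flags)
{
	return flags & FLAG_READY;	/* 0x100 truncated to 8 bits is 0 */
}
```

With the flag set, ``is_ready()`` reports true while ``is_ready_u8()`` silently reports 0.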
				977
				978	When working with bool values the true and false definitions should be used
				979	instead of 1 and 0.
				980
				981	bool function return types and stack variables are always fine to use whenever
				982	appropriate. Use of bool is encouraged to improve readability and is often a
				983	better option than 'int' for storing boolean values.
				984
Do not use bool if cache line layout or size of the value matters, as its size
and alignment vary based on the compiled architecture. Structures that are
optimized for alignment and size should not use bool.
				988
				989	If a structure has many true/false values, consider consolidating them into a
				990	bitfield with 1 bit members, or using an appropriate fixed width type, such as
				991	u8.
				992
				993	Similarly for function arguments, many true/false values can be consolidated
				994	into a single bitwise 'flags' argument and 'flags' can often be a more
				995	readable alternative if the call-sites have naked true/false constants.
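Both suggestions, 1 bit members in a structure and a single 'flags' argument, can be sketched together. All names below are hypothetical.

```c
/* Hypothetical option flags, for illustration only. */
#define OPEN_CREATE	(1u << 0)
#define OPEN_TRUNCATE	(1u << 1)
#define OPEN_SYNC	(1u << 2)

/* Several true/false values packed into 1 bit members. */
struct open_state {
	unsigned int create:1;
	unsigned int truncate:1;
	unsigned int sync:1;
};

/* One 'flags' argument replaces three positional true/false values. */
struct open_state open_state_from_flags(unsigned int flags)
{
	struct open_state st = {
		/* the members are not bool, so !! is still needed here */
		.create = !!(flags & OPEN_CREATE),
		.truncate = !!(flags & OPEN_TRUNCATE),
		.sync = !!(flags & OPEN_SYNC),
	};

	return st;
}
```

A call such as ``open_state_from_flags(OPEN_CREATE | OPEN_SYNC)`` says what it asks for, whereas a hypothetical three-bool variant called with ``(true, false, true)`` would not.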
				996
				997	Otherwise limited use of bool in structures and arguments can improve
				998	readability.
				999
18) Don't re-invent the kernel macros
-------------------------------------
				1002
				1003	The header file include/linux/kernel.h contains a number of macros that
				1004	you should use, rather than explicitly coding some variant of them yourself.
				1005	For example, if you need to calculate the length of an array, take advantage
				1006	of the macro
				1007
.. code-block:: c

	#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
				1011
				1012	Similarly, if you need to calculate the size of some structure member, use
				1013
.. code-block:: c

	#define sizeof_field(t, f) (sizeof(((t*)0)->f))

There are also min() and max() macros that do strict type checking if you
need them. Feel free to peruse that header file to see what else is already
defined that you shouldn't reproduce in your code.
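For illustration, here is how the two macros quoted above behave, using userspace copies of their definitions. ``struct packet``, ``primes`` and the two functions are hypothetical.

```c
#include <stddef.h>

/* Userspace copies of the two macros quoted above. */
#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
#define sizeof_field(t, f) (sizeof(((t *)0)->f))

/* Hypothetical structure and table, for illustration only. */
struct packet {
	unsigned char header[12];
	unsigned int length;
};

int primes[] = { 2, 3, 5, 7, 11 };

/* Number of elements, which stays correct when the table grows. */
size_t primes_count(void)
{
	return ARRAY_SIZE(primes);
}

/* Size of one member, without needing an object of the type. */
size_t header_size(void)
{
	return sizeof_field(struct packet, header);
}
```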
				1021
				1022
19) Editor modelines and other cruft
------------------------------------
				1025
				1026	Some editors can interpret configuration information embedded in source files,
				1027	indicated with special markers. For example, emacs interprets lines marked
				1028	like this:
				1029
.. code-block:: c

	-*- mode: c -*-
				1033
				1034	Or like this:
				1035
.. code-block:: c

	/*
	Local Variables:
	compile-command: "gcc -DMAGIC_DEBUG_FLAG foo.c"
	End:
	*/
				1043
				1044	Vim interprets markers that look like this:
				1045
.. code-block:: c

	/* vim:set sw=8 noet */
				1049
				1050	Do not include any of these in source files. People have their own personal
				1051	editor configurations, and your source files should not override them. This
				1052	includes markers for indentation and mode configuration. People may use their
				1053	own custom mode, or may have some other magic method for making indentation
				1054	work correctly.
				1055
				1056
20) Inline assembly
-------------------
				1059
				1060	In architecture-specific code, you may need to use inline assembly to interface
				1061	with CPU or platform functionality. Don't hesitate to do so when necessary.
				1062	However, don't use inline assembly gratuitously when C can do the job. You can
				1063	and should poke hardware from C when possible.
				1064
				1065	Consider writing simple helper functions that wrap common bits of inline
				1066	assembly, rather than repeatedly writing them with slight variations. Remember
				1067	that inline assembly can use C parameters.
				1068
				1069	Large, non-trivial assembly functions should go in .S files, with corresponding
				1070	C prototypes defined in C header files. The C prototypes for assembly
				1071	functions should use ``asmlinkage``.
				1072
				1073	You may need to mark your asm statement as volatile, to prevent GCC from
				1074	removing it if GCC doesn't notice any side effects. You don't always need to
				1075	do so, though, and doing so unnecessarily can limit optimization.
				1076
				1077	When writing a single inline assembly statement containing multiple
				1078	instructions, put each instruction on a separate line in a separate quoted
				1079	string, and end each string except the last with ``\n\t`` to properly indent
				1080	the next instruction in the assembly output:
				1081
.. code-block:: c

	asm ("magic %reg1, #42\n\t"
	     "more_magic %reg2, %reg3"
	     : /* outputs */ : /* inputs */ : /* clobbers */);
				1087
				1088
21) Conditional Compilation
---------------------------
				1091
				1092	Wherever possible, don't use preprocessor conditionals (#if, #ifdef) in .c
				1093	files; doing so makes code harder to read and logic harder to follow. Instead,
				1094	use such conditionals in a header file defining functions for use in those .c
				1095	files, providing no-op stub versions in the #else case, and then call those
				1096	functions unconditionally from .c files. The compiler will avoid generating
				1097	any code for the stub calls, producing identical results, but the logic will
				1098	remain easy to follow.
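A minimal sketch of the stub pattern described above, collapsed into one listing. ``CONFIG_FROB``, ``frob_init()``, ``driver_probe()`` and the header name are hypothetical.

```c
/* Normally set by the build system; hypothetical symbol, left unset here. */
/* #define CONFIG_FROB 1 */

/* --- would live in a header, e.g. a hypothetical frob.h --- */
#ifdef CONFIG_FROB
int frob_init(void);
#else
static inline int frob_init(void)
{
	return 0;	/* no-op stub: nothing to set up */
}
#endif /* CONFIG_FROB */

/* --- .c file: no preprocessor conditionals needed --- */
int driver_probe(void)
{
	int err;

	err = frob_init();
	if (err)
		return err;
	return 0;
}
```

The caller reads the same in both configurations, and with the stub selected the compiler can drop the call entirely.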
				1099
				1100	Prefer to compile out entire functions, rather than portions of functions or
				1101	portions of expressions. Rather than putting an ifdef in an expression, factor
				1102	out part or all of the expression into a separate helper function and apply the
				1103	conditional to that function.
				1104
				1105	If you have a function or variable which may potentially go unused in a
				1106	particular configuration, and the compiler would warn about its definition
				1107	going unused, mark the definition as __maybe_unused rather than wrapping it in
				1108	a preprocessor conditional. (However, if a function or variable always goes
				1109	unused, delete it.)
				1110
				1111	Within code, where possible, use the IS_ENABLED macro to convert a Kconfig
				1112	symbol into a C boolean expression, and use it in a normal C conditional:
				1113
.. code-block:: c

	if (IS_ENABLED(CONFIG_SOMETHING)) {
		...
	}
				1119
				1120	The compiler will constant-fold the conditional away, and include or exclude
				1121	the block of code just as with an #ifdef, so this will not add any runtime
				1122	overhead. However, this approach still allows the C compiler to see the code
				1123	inside the block, and check it for correctness (syntax, types, symbol
				1124	references, etc). Thus, you still have to use an #ifdef if the code inside the
				1125	block references symbols that will not exist if the condition is not met.
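To show how a possibly-undefined symbol can become a C expression at all, here is a simplified userspace imitation of the preprocessor trick. It only handles symbols defined to 1 (the real macro also covers modules), and ``CONFIG_OTHER`` and ``features()`` are hypothetical.

```c
#define CONFIG_SOMETHING 1
/* CONFIG_OTHER is deliberately left undefined. */

/*
 * If the option is defined to 1, the pasted name expands to "0," and
 * shifts a literal 1 into the second argument position; otherwise the
 * second argument is the trailing 0.
 */
#define __ARG_PLACEHOLDER_1 0,
#define __take_second_arg(__ignored, val, ...) val
#define __is_defined(x)			___is_defined(x)
#define ___is_defined(val)		____is_defined(__ARG_PLACEHOLDER_##val)
#define ____is_defined(arg1_or_junk)	__take_second_arg(arg1_or_junk 1, 0)

/* Simplified: only handles built-in (=y) options. */
#define IS_ENABLED(option) __is_defined(option)

int features(void)
{
	int mask = 0;

	if (IS_ENABLED(CONFIG_SOMETHING))
		mask |= 1;
	if (IS_ENABLED(CONFIG_OTHER))
		mask |= 2;
	return mask;
}
```

Both branches are parsed and type-checked by the compiler, but only the first survives constant folding.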
				1126
				1127	At the end of any non-trivial #if or #ifdef block (more than a few lines),
				1128	place a comment after the #endif on the same line, noting the conditional
				1129	expression used. For instance:
				1130
.. code-block:: c

	#ifdef CONFIG_SOMETHING
	...
	#endif /* CONFIG_SOMETHING */
				1136
				1137
				1138	Appendix I) References
				1139	----------------------
				1140
The C Programming Language, Second Edition
by Brian W. Kernighan and Dennis M. Ritchie.
Prentice Hall, Inc., 1988.
ISBN 0-13-110362-8 (paperback), 0-13-110370-9 (hardback).

The Practice of Programming
by Brian W. Kernighan and Rob Pike.
Addison-Wesley, Inc., 1999.
ISBN 0-201-61586-X.

GNU manuals - where in compliance with K&R and this text - for cpp, gcc,
gcc internals and indent, all available from https://www.gnu.org/manual/

WG14 is the international standardization working group for the programming
language C, URL: http://www.open-std.org/JTC1/SC22/WG14/

Kernel :ref:`process/coding-style.rst <codingstyle>`, by greg@kroah.com at OLS 2002:
http://www.kroah.com/linux/talks/ols_2002_kernel_codingstyle_talk/html/