Documentation conventions¶
Summary points / highlights¶
Refer to the Google documentation guide; this document only extends Google’s. This document is longer than originally planned, so you can ignore most of it. The crucial points are highlighted below.
Filenames:
- Use lowercase-kebab-case, limiting characters to
[a-z0-9-]
(and.
for filename extensions). - Use
LICENSE.txt
overLICENSE
andconfig.yaml
overconfig.yml
.
Language:
- Be explicit and precise, especially about edge cases.
- An example, code sample, table, or accessible diagram is often sufficient.
Markdown:
- Start a new line for each sentence (it helps with diffs). Aim to keep lines under 100 characters, by breaking at sensible places.
- Use
**bold**
for emphasis; and use_italics_
as the<i>
element.
Comments (in code):
- Don’t write useless comments just to follow a convention.
Quantities:
- Use
54 kg
, not54kg
. - For uncertainties, state what the numbers mean. E.g.: 1 ± 1 (SD), 1 ± 1 (SE), 1 (0–2, 90% CI), or 0–2 over 100 trials.
Filenames¶
These are general guidelines: Alternatives should be used in some situations. For example, if camelCase is used in your JSON Schema, use camelCase for schema document filenames.
Rationale
- The official YAML extension is
.yaml
. The IANA media types areapplication/yaml
,text/html
, andimage/jpeg
..yml
,.htm
, and.jpg
are relics of DOS. - Extensions prominently show essential information. For example, a file named
info
could be a plain-text info document or a shell script that writes the info.
Preferred¶
Tip
Also apply this section to URI nodes, database IDs, and similar constructs.
Use kebab-case (e.g. full-document.pdf
), treating -
as a space. If necessary, --
, +
, ,
, and/or ~
can be used as specialized word separators. For example, +
could denote joint authorship in mary-johnson+kerri-swanson-document.pdf
. If you can, restrict to [a-z0-9]
. Capital letters may be useful in some cases, but use them with reservation.
Always include filename extensions (except perhaps for Unix executable files); e.g. LICENSE.txt
or LICENSE.md
, not LICENSE
. Use .yaml
for YAML, .html
for HTML, and .jpeg
for JPEG.
Strongly recommended¶
Never start a name with -
/~
or end it with .
/~
. Don’t use problematic characters such as quotation marks, semicolons, or colons. Also don’t use reserved names, including .
, ..
, or Windows reserved filenames. Limit names to 255 characters.
Pro-tip – pathvalidate
The PyPi package pathvalidate will validate paths for all major operating systems and filesystem types by default. You can use the CLI, pathvalidate-cli (which returns nonzero on error). Example:
Comments¶
Comments must be maintained like all other elements of code. Avoid unnecessary comments, such as those added out of habit or ritual. Forgo comments that are obvious or otherwise unhelpful.
Examples
class SpecialCache[T]:
"""
Soft cache supporting the Foobar backend.
Uses a Least Recently Used (LRU) policy with an expiration duration.
"""
def get_cache_item(self, key: str) -> T:
# (1)!
"""
Gets the cache item corresponding to key `key`.
Arguments:
key: A string-valued key
Returns:
The value for the key `key`
"""
return self._items[key]
- The docstring argument list and return description serve no function.
Language and grammar¶
Apply these guidelines to both comments and documentation. See Google’s documentation style guide for additional guidelines.
Style¶
Remove text that is repetitive or superfluous. Be direct. Use examples, diagrams, formal grammars, pseudocode, and mathematical expressions. Write English descriptions for any elements that are probably inaccessible to screen readers.
Keep language accessible: Introduce or explain jargon, favor simpler words, and replace idioms with literal phrasing. Do not rely on mutual agreement about subtle differences between similar words. For example, do not use both significant and substantial. To distinguish a very significant finding from a significant one, write very significant. Be explicit. Use singular they and other gender-neutral terms, and use inclusive language. Substitute long phrases with shorter ones.
Great documentation should not win poetry awards. Keep things simple and direct.
Examples: overlong phrases
❌ Avoid | ✅ Preferred |
---|---|
utilize | use |
due to the fact that | because |
a great number of | many |
is able to | can |
needless to say | (omit) |
it is important to note that | importantly, |
Spelling¶
Use American English spelling. American English is the most widespread dialect, and it generally has more phonetic and shorter spellings.
Grammar and punctuation¶
Use 1 space between sentences. Use sentence case for titles and table headers (e.g. This is a title). Capitalize the first word after a colon only if it begins a complete sentence; do not capitalize the first word after a semicolon.
Terminology¶
Prefer the term URI, over the terms URL and URN.
Rationale
From RFC 3986 §1.1.3:
Future specifications and related documentation should use the general term “URI” rather than the more restrictive terms “URL” and “URN” [RFC3305].
Markdown¶
Where applicable, apply these guidelines to other documentation, not just Markdown.
Line breaks¶
Start each sentence on a new line.
Rationale
Keeping each sentence on its own line dramatically simplifies diffs.
Note
Apparently, I’m not the only one who figured out how helpful this is. Developer Brandon Rhodes named this semantic linefeeds. He traced the idea to an excerpt from Brian Kernighan’s UNIX for beginners that discusses manual editing rather than computational diffs:
Start each sentence on a new line. Make lines short, and break lines at natural places, such as after commas and semicolons, rather than randomly. Since most people change documents by rewriting phrases and adding, deleting and rearranging sentences, these precautions simplify any editing you have to do later.
Incorrect \n
treatment in GitHub Issues, Discussions, and PRs
As recently as 2024-12, GitHub incorrectly renders \n
as <br>
in Discussions, Issues, and Pull Requests. Per the original Markdown spec, CommonMark, and GitHub Flavored Markdown, a non-consecutive \n
is a soft line break, either a space or \n
in HTML. In contrast, GitHub uses the correct behavior for .md
files (e.g. README.md
). Obviously, prioritize readability of the rendered document over readability of the source.
If needed to prevent a line from exceeding 100 characters, add line breaks elsewhere. Look for one of these places to add a line break:
- Before and/or after a Markdown link
- Before an opening HTML tag (or Markdown equivalent) or after a closing tag
- After punctuation that begins an independent clause
- At another natural place
Also add a line break somewhere if you think it’s helpful. In particular, consider around a long Markdown link, or before items of an inline list. For example:
You must abide by
[ARPA, 16 U.S.C. §§ 470aa–470mm](https://uscode.house.gov/view.xhtml?path=/prelim@title16/chapter1B&edition=prelim),
particularly
§ 470hh, “Confidentiality of information concerning nature and location of archeological resources”
and
§ 470ii, “Rules and regulations; intergovernmental coordination”.
Note
Don’t obsess over line lengths! A line of 120 characters is unlikely to kill or dismemeber anyone.
Text styles and semantics¶
Rationale
This balancces between ease of writing Markdown, readability of the rendered HTML, and semantic precision.
Some semantic HTML elements are result in identical styles (in most browsers). This table summarizes how to use them in HTML and in Markdown.
element | style | usages | equivalent Markdown |
---|---|---|---|
em | italic | emphasis; stress | <em>text</em> (prefer bold) |
i | italic | technical terms; foreign text; more | _text_ |
var | italic | terms being defined | _term_ or <dfn>term</dfn> |
var | italic | variables | $var$ |
strong | bold | strong emphasis | **text** |
b | bold | keywords, miscellaneous | <b>text</b> |
code | monospace | source code | `code` |
samp | monospace | sample output | `code` |
kbd | monospace | keyboard keys | <kdb>keys</kbd> |
Italics¶
Take _
/_
(and *
/*
) as semantically equivalent to the <i>
element.
- Foreign words, technical terms, etc.
- Literal references to words/phrases themselves. Importantly, single quotation marks (
‘
/’
) or Markdown backticks may also be used.
- Ubiquitous foreign phrases like in vivo, in sitro, in silico, and _et al.; no markup is needed.
- Emphasis or importance. Instead, use bold
**
/**
. - Stress (e.g. to distinguish
<<I>> will go there
andI will go <<there>>
.). (Refer to the following section.)
[Linguistic stress](https://en.wikipedia.org/wiki/Stress_(linguistics) is usually marked using italics. Consider the difference between _I_ will go there
and I will go _there_
. The italicization is essential to the meaning. This can lead to confusion if read as plaintext or by a screen reader. It’s best to make the exact meaning explicit by rephrasing: e.g. I – specifically I – will go there.
If you need to use italics for stress, prefer explicit <em></em>
.
The dfn
element can be used in Markdown. _
/_
is an acceptable fallback.
Bold¶
Use **
/**
bold for emphasis. Take **
/**
as semantically equivalent to the <strong>
element.
Use the <b>
element explicitly rather than **
/**
for text that should be bold but not emphasized – i.e. semantically distinct from the surrounding text. For example, you might write <b>Score:</b> 55.3%
.
Code and math¶
Tip: Smarty plugin
With the Material for mkdocs Smarty plugin, you can use ++
/++
instead of <kdb>
/<kdb>
; e.g. ++ctrl+alt+del++
.
Use backticks for code, <kbd>
/</kdb>
for keyboard keys. For math, use LaTeX inline $
/$
for single-line and $$
/$$
for multi-line.
Menu navigation¶
To describe menu navigation, use ➤
in italics; e.g. File ➤ Export ➤ Export as Text. Try to use the exact words, capitalization, and punctuation. For example, write File ➤ Settings… ➤ Advanced if the menu uses ...
(and not U+2026 / …
).
Encoding¶
Write most non-ASCII characters as-is, not with entity references. For example, write an en dash as –
, not –
.
Except, use hexadecimal entity references for
- non-space whitespace characters; and (such as no break space,
); - punctuation that is highly or extremely likely to confuse anyone reading the source code (such as soft hyphen,
­
); and - characters that must be escaped for technical reasons
Unicode characters¶
Tip: Smarty plugin
With the Material for mkdocs Smarty plugin, you can use use '
and "
for quotation marks, --
for en dashes, ---
for em dashes, and ...
for ellipses.
Use the correct Unicode characters for punctuation. (Of course, use regular hyphen-minus (U+002D) for hyphens, not hyphen (U+2010).))
Examples
’
for apostrophes‘
,’
,“
, and”
for quotation marks–
(en dash) for numerical ranges (e.g.5–10
)—
(em dash) to separate a blockquote and its source‒
(figure dash) in numerical formatting…
(ellipses)−
(minus sign)µ
(micro sign)
Punctuation (prescriptive grammar)¶
Use an en dash surrounded by spaces (–
) to mark breaks in thoughts, not an em dash. For example:
For i.e. and e.g., skip the comma (British English) and normally introduce with ;. For example: say something nice; e.g. “nice boots”.
.
Abbreviations¶
For the first appearance, consider writing it out in this format: Public Library of Science (PLOS) †. Omit periods (.
) for initialisms; e.g. USA, not U.S.A..
† Note that the correct abbreviation for PLOS is PLOS, not PLoS.
You can use, e.g., the Material for MkDocs Abbreviations extension <abbr>
, or just an HTML tag with the title
attribute.
Admonitions¶
Use Material for mkdocs admonitions, GitHub-Flavored Markdown alerts, or other “admonition” syntax the same way as the <aside>
element. The content should be (A) “offset” or “aside”, (B) important when skimmed, or (C) metatext (describes its surrounding content).
Footnotes¶
Rationale
These symbols are easily recognized as indicating footnotes. Superscript numbers could be confused with reference numbers or even exponents in some places, and they are less accessible for people who are vision-impaired.
- The asterisk
*
has other very common meanings and is inconvenient in Markdown. (The U+204E⁎
avoids the second problem.) - The double vertical bar
‖
has other very common meanings. - The pilcrow
¶
is too heavy and prominent in most fonts. - Doubled-up footnote symbols mean you have too many footnotes.
Use either a Markdown extension for footnotes, or follow this format:
The symbols don’t need to be superscripted. Placement is discretionary: position before or after punctuation, with or without a space. These symbols are recommended, in order: †
(dagger), ‡
(double dagger), §
(section mark), ♯
(musical sharp), ♮
(musical natural), and ◊
(lozenge).
You can modify this list if needed, or use another scheme like superscript numbers, superscript lowercase letters, etc.
Citations¶
For references, use the IEEE inline style: [1]
, [2]
, etc. IEEE style is also encouraged for bibliographies.
Quotations¶
Rationale
This preserves the semantic difference between punctuation inside and outside of quotations. This rule is always followed when using code in backticks, anyway.
Place punctuation outside of quotation marks (British-style rules). For example, in Also write ‘hard’, ‘difficult’, or ‘strenuous’.
Introduce code blocks with punctuation only where semantically valid. If it is semantically valid, use a colon rather than a comma. In blockquotes, use _— author_
(with an em dash) to cite the source.
Examples
In the following block, use then run, not then run:.
However, use a colon here:
Inline lists (enumerations)¶
For inline lists, follow this format:
(1) Use (1)
, (a)
, (A)
, (i)
, or (I)
; (2) use commas or semicolons; (3) start a line for each item; (4) specify and
, or
, or nor
; and (5) end the last line with the applicable punctuation.
Formatting numbers¶
Numbers, units, dimensioned quantities, ranges, and measurement uncertainties must be readable. This guide follows the IEEE recommendation. Also see NIST note 1297 and the Guides in Metrology.
!!! tip: “Tip: LuaTeX, siunitx, and fontspec”
LuaTeX, [siunitx](https://ctan.org/pkg/siunitx), and [fontspec](https://ctan.org/pkg/fontspec)
is the ideal solution, but it is not usable in Markdown.
You need a suitable font, and `fontspec`, `\unimathsetup` and `\sisetup` as shown in
[`fontworthy.sty`](https://github.com/dmyersturnbull/desert-latex/blob/main/src/fontworthy.sty).
Numbers¶
Use:
- A period (
.
) as the decimal separator - A narrow no break space, (
, U+002D/
) as the thousands separator
Example:
- ✅ Preferred
1 024.222 222
(result: 1 024.222 222) - ✅ Preferred (LaTeX) `\(1~024.222~222\) (result: \(1~024.222~222\))
- ✅ Preferred (SIUnitX)
$\num{1024.222222}$
- 🟨 Acceptable
1024.222222
(result: 1024.222222) - ❌ Not acceptable
1,024.222222
or1.024,222222
Dimensioned quantities¶
Use:
- A normal space or a no-breaking space (
/U+00A0/
) to separate magnitude and units. - A middle dot (
·
/U+00B7/·
) to multiply units. - Unicode superscript digits (
⁰¹²³⁴⁵⁶⁷⁸⁹
) and minus sign (⁻
) for unit exponents.
Examples:
- ✅ Preferred
5 kg·m²·s⁻²
(result: 5 kg·m²·s⁻²) - ✅ Preferred (LaTeX)
$5~\mathrm{kg\cdot m^2\cdot s^{-2}}$
(result: \(5~\mathrm{kg \cdot m^2 \cdot s^{-2}}\)) - ✅ Preferred (SIUnitX)
$\qty{5 \kilogram\meter\squared\per\second\squared}$
- 🟨 Acceptable
5 kg m^2 / s^2
(result: 5 kg m^2 / s^2) - ❌ Not acceptable
5 kg*m^2/s^2
(result: 5 kg*m2/s2)
Specific cases¶
- For the micro SI prefix, use the micro sign,
μ
/U+00B5. † - For Ohms, use the Greek capital letter omega,
Ω
/U+03A9 † - Write
50 kibibytes
or50 kiB
, using the base-2 ISO/IEC 80000 standard, which includes prefixes kibi- (Ki), mebi- (Mi), gibi- (Gi), tebi- (Ti), etc. - You may use either
50%
(widely used) or50 %
(NIST-recommended). - For angles, omit a space before
°
as in90°
. - Prefer decimal degrees notation – e.g.
90° 30′ 15″
– for longitude and latitude. Use no-break spaces (/U+00A0/
), and proper symbols for prime (′
/U+2023/′
) and double prime (″
/U+2033/″
).
† Why U+00B5 micro and not U+03BC mu?
Use the micro sign instead of the greek letter mu, but capital omega instead of the ohm sign. From the Unicode spec, “Greek Letters as Symbols”:
For compatibility purposes, a few Greek letters are separately encoded as symbols in other character blocks. Examples include U+00B5 µ MICRO SIGN in the Latin-1 Supplement character block and U+2126 Ω OHM SIGN in the Letterlike Symbols character block. The ohm sign is canonically equivalent to the capital omega, and normalization would remove any distinction. The Ohm sign’s use is therefore discouraged in favor of capital omega. The same equivalence does not exist between micro sign and mu, and use of either character as a micro sign is common.
Uncertainty measurements¶
Mean ± error¶
State whether a value means standard error or standard deviation. You can use the abbreviations standard error (SE) and standard deviation (SD).
Example
Is 0.1
standard deviation or standard error?
- ❌ Incorrect
5.0 ±0.1
- ✅ Correct
5.0 ±0.1 (SE)
- ✅ Also correct
5.0 ±0.1 (SD)
Use one of these formats:
- Standard error: 7.65 ±1.2 (SE)
- Standard deviation: 7.65 ±0.54 (SD)
- SE/SD with units: (7.65 ±0.54) J·m⁻² (SD)
Ranges and confidence intervals¶
Don’t just write 56–81
. Is that an exact range from a sample or a confidence interval?
- If it’s over a sample, where values ranged between
56
and81
, explain that. Include the sample size or number of measurements; e.g. n = 100 machines or n = 100 repeat measurements. - If it’s a confidence interval, include the level of confidence. You can use the abbreviation confidence interval (CI).
Example
What does 1–2 hours mean?
- ❌ Incorrect Half-life is 1–2 hours.
- ✅ Correct – range _Half-life was 1–2 hours among 58 participants.
- ✅ Correct – CI _Mean half-life is 1.4 hours (0.9–1.9, 80% CI).
Use one of these formats:
- CI: 7.65 (4.0–12.5, 95% CI)
- CI with units (variant 1, simplest): 7.65 (4.0–12.5, 95% CI) J·m⁻²
- CI with units (variant 2): 7.65 J·m⁻² (4.0–12.5, 95% CI)
- CI with units (variant 3): 7.65 J·m⁻² (4.0 to 12.5, 95% CI)
Dates and times¶
Use RFC 3339; e.g. 2023-11-02T14:55:00-08:00
. Note that the UTC offset is written with a hyphen, not a minus sign. If a timezone is needed, use a Canonical IANA timezone such as America/Los_Angeles
, and set it in square brackets after the UTC offset. For example: 2023-11-02T14:55:00 -08:00 [America/Los_Angeles]
.
Durations and intervals¶
For durations, use 8.3 s. hr
, min
, and sec
/s
are acceptable abbreviations, but M
for minute is not. hh:mm:ss
(e.g. 12:30:55
) is generally ok, but do not use mm:ss
or hh:mm
– these are ambiguous. Also do not use the ISO 8601’s P
/PT
duration syntax (e.g. PT45M55S
) in documentation.
Filesystem paths and trees¶
Always use /
as a path separator in documentation, and denote directories with a trailing /
.
For filesystem trees, use Unicode box-drawing characters. Refer to the research projects guide for an example.
Accessibility¶
Use descriptive titles for link titles.
- ✅ Correct
Refer to the [documentation conventions](documentation.md).
- ❌ Incorrect
Click [here](documentation.md) for conding conventions.
HTML¶
Follow the applicable guidelines from the Markdown section.
Attributes¶
Use kebab-case for id
and name
values and for data
keys and values. Use the alt
attribute for media elements, including <img>
, <video>
, <audio>
, and <canvas>
.
Formatting¶
Use Prettier with default options except for line length, which must be 100. Note that Prettier wraps tags in way that looks strange at first; it does that to avoid adding extra whitespace.
Closing tags¶
Always include the <html>
, <head>
, and <body>
elements. Also, always close tags – for example, use <p>The end.</p>
, not <p>The end.
.
These two practices improve readability and massively simplify parsing. The rules for omitting <html>
, etc. are also complex and better ignored.
Formal grammars¶
Grammars may be specified in any well-defined meta-grammar. Specify the syntax used.
Notable good options:
With ABNF, avoid the incremental alternatives notation (=/
). Because this modifies an already-defined rule, it complicates reading. Also avoid the core rules CHAR
, LWSP
, CTL
, VCHAR
, and WSP
, which are misleading because they are restricted ASCII.
[IEC
]: International Electrotechnical Commission