Documentation conventions¶
Summary points / highlights¶
These guidelines are aggressive and lengthy (more than originally intended). Skim or gloss over the details. Here are the important points.
Filenames:
- Use lowercase-kebab-case, limiting characters to `[a-z0-9-]`.
- Use `LICENSE.txt` and `config.yaml` over `LICENSE` and `config.yml`.
Language:
- Keep language simple, and be explicit.
- Sometimes an example is sufficient.
Markdown:
- Start a new line for every sentence (it helps with diffs).
- Limit lines to 120 characters, breaking at sensible places.
- Use `**bold**` for emphasis, `_italics_` as the `<i>` element, and `<b></b>` for non-emphasized but bold text; e.g. `<b>Score:</b> 12.5`.
Comments (in code):
- Forgo comments that are superfluous or included out of habit or convention.
Filenames¶
This section can apply to naming of URI nodes, database IDs, and similar constructs. These are general guidelines: Alternatives should be used in some situations. For example, if camelCase is used in your JSON Schema, use camelCase for schema document filenames.
Rationale
The official YAML extension is `.yaml`. Moreover, the IANA media types are `application/yaml`, `text/html`, and `image/jpeg`. `.yml`, `.htm`, and `.jpg` are relics of DOS. Extensions prominently show essential information, and omitting them can cause confusion. For example, a file named `info` could be a plain-text info document or a shell script that writes the info. Instead, write it as `info.txt` or `info.sh`.
Prefer kebab-case (e.g. `full-document.pdf`), treating `-` as a space. Restrict to `-`, `.`, `[a-z]`, and `[0-9]`, unless there is a compelling reason otherwise. If necessary, `--`, `+`, and `~` can be used as specialized word separators. For example, `+` could denote joint authorship in `mary-johnson+kerri-swanson-document.pdf`.
Always use one or more filename extensions, except for executable files; e.g. `LICENSE.txt` or `LICENSE.md`, not `LICENSE`. Where possible, use `.yaml` for YAML, `.html` for HTML, and `.jpeg` for JPEG. In particular, do not use `.yml`, `.htm`, `.jpg`, or `.jfif`.
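The filename rules above can be checked mechanically. The following is a minimal sketch, not a definitive implementation: the names `check_filename`, `_EXEMPT`, and `_DISCOURAGED` are hypothetical, the exemption list is an assumption, and the specialized separators (`--`, `+`, `~`) and extensionless executables are deliberately not handled.

```python
import re

# Assumed pattern: a lowercase kebab-case stem followed by one or more extensions.
_NAME = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*(?:\.[a-z0-9]+)+")

# Assumption: conventional uppercase names that are accepted as-is.
_EXEMPT = {"LICENSE.txt", "LICENSE.md", "README.md"}

# Extensions this guide rejects, mapped to the preferred form.
_DISCOURAGED = {".yml": ".yaml", ".htm": ".html", ".jpg": ".jpeg", ".jfif": ".jpeg"}


def check_filename(name: str) -> list[str]:
    """Return a list of problems found in `name` (empty if none)."""
    if name in _EXEMPT:
        return []
    problems = []
    if not _NAME.fullmatch(name):
        problems.append("not lowercase kebab-case with an extension")
    for bad, good in _DISCOURAGED.items():
        if name.lower().endswith(bad):
            problems.append(f"use {good} instead of {bad}")
    return problems
```

For example, `check_filename("config.yml")` reports the discouraged extension, while `check_filename("full-document.pdf")` returns an empty list.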
Comments¶
Comments must be maintained like all other elements of code. Avoid unnecessary comments, such as those added out of habit or ritual. Forgo comments that are obvious or otherwise unhelpful.
Examples
class SpecialCache[T]:
    """
    Soft cache supporting the Foobar backend.
    Uses a Least Recently Used (LRU) policy with an expiration duration.
    """

    def get_cache_item(self, key: str) -> T:
        # (1)!
        """
        Gets the cache item corresponding to key `key`.

        Arguments:
            key: A string-valued key

        Returns:
            The value for the key `key`
        """
        return self._items[key]
- The docstring argument list and return description serve no function.
Language and grammar¶
Apply these guidelines to both comments and documentation. See Google’s documentation style guide for additional guidelines.
Style¶
Remove text that is repetitive or superfluous. Be direct. Use examples, diagrams, formal grammars, pseudocode, and mathematical expressions. Write English descriptions for diagrams and any other elements that are inaccessible to screen readers.
Keep language accessible: Introduce or explain jargon, favor simpler words, and replace idioms with literal phrasing. Do not rely on mutual agreement about subtle differences between similar words. For example, do not use both significant and substantial. To distinguish a very significant finding from a significant one, write very significant. Be explicit. Use singular they and other gender-neutral terms, and use inclusive language. Substitute long phrases with shorter ones.
Great documentation should not win poetry awards. Keep things simple and direct.
Examples: overlong phrases
❌ Avoid | ✅ Preferred |
---|---|
utilize | use |
due to the fact that | because |
a great number of | many |
is able to | can |
needless to say | (omit) |
it is important to note that | importantly, |
Spelling¶
Use American English spelling. American English is the most widespread dialect, and it generally has more phonetic and shorter spellings.
Grammar and punctuation¶
Use 1 space between sentences.
Use sentence case for titles and table headers (e.g. This is a title). Capitalize the first word after a colon only if it begins a complete sentence; do not capitalize the first word after a semicolon.
Terminology¶
Prefer the term URI over the terms URL and URN.
Rationale
From RFC 3986 §1.1.3:
Future specifications and related documentation should use the general term “URI” rather than the more restrictive terms “URL” and “URN” [RFC3305].
Markdown¶
Where applicable, apply these guidelines to other documentation, not just Markdown.
Line breaks¶
Start each sentence on a new line.
Rationale
Keeping each sentence on its own line dramatically simplifies diffs.
Incorrect `\n` treatment in GitHub Issues, Discussions, and PRs
As recently as 2024-12, GitHub incorrectly renders `\n` as `<br>` in Discussions, Issues, and Pull Requests. Per the original Markdown spec, CommonMark, and GitHub Flavored Markdown, a non-consecutive `\n` is a soft line break, rendered as either a space or `\n` in HTML. In contrast, GitHub uses the correct behavior for `.md` files (e.g. `README.md`). Prioritize readability of the rendered document over readability of the source.
If needed to prevent a line from exceeding 120 characters, add line breaks elsewhere. Look for one of these places to add a line break:
- Before and/or after a Markdown link
- Before an opening HTML tag (or Markdown equivalent) or after a closing tag
- After punctuation that begins an independent clause
- At another natural place
Also add a line break somewhere if you think it’s helpful. In particular, consider adding one around a long Markdown link, or before items of an inline list. For example:
You must abide by
[ARPA, 16 U.S.C. §§ 470aa–470mm](https://uscode.house.gov/view.xhtml?path=/prelim@title16/chapter1B&edition=prelim),
particularly
§ 470hh, “Confidentiality of information concerning nature and location of archeological resources”
and
§ 470ii, “Rules and regulations; intergovernmental coordination”.
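The 120-character limit is easy to check automatically. This is a small sketch under the assumption that length is counted in Unicode code points; the function name `overlong_lines` is hypothetical.

```python
def overlong_lines(text: str, limit: int = 120) -> list[int]:
    """Return the 1-based numbers of lines longer than `limit` characters."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if len(line) > limit
    ]
```

A pre-commit hook or CI step could call this on each Markdown file and fail when the returned list is non-empty.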
Text styles and semantics¶
Rationale
This balances ease of writing Markdown, readability of the rendered HTML, and semantic precision.
Some semantic HTML elements result in identical styles (in most browsers). This table summarizes how to use them in HTML and in Markdown.
element | style | usages | equivalent Markdown |
---|---|---|---|
em | italic | emphasis; stress | <em>text</em> (prefer bold) |
i | italic | technical terms; foreign text; more | _text_ |
dfn | italic | terms being defined | _term_ or <dfn>term</dfn> |
var | italic | variables | $var$ |
strong | bold | strong emphasis | **text** |
b | bold | keywords, miscellaneous | <b>text</b> |
code | monospace | source code | `code` |
samp | monospace | sample output | `code` |
kbd | monospace | keyboard keys | <kbd>keys</kbd> |
Italics¶
Take `_`/`_` (and `*`/`*`) as semantically equivalent to the `<i>` element.
- Foreign words, technical terms, etc.
- Literal references to words/phrases themselves. Importantly, single quotation marks (`‘`/`’`) or Markdown backticks may also be used.
- Ubiquitous foreign phrases like in vivo, in vitro, in silico, and et al.; no markup is needed.
- Emphasis or importance. Instead, use bold `**`/`**`.
- Stress (e.g. to distinguish `<<I>> will go there` and `I will go <<there>>`). (Refer to the following section.)
[Linguistic stress](https://en.wikipedia.org/wiki/Stress_(linguistics)) is usually marked using italics. Consider the difference between `_I_ will go there` and `I will go _there_`. The italicization is essential to the meaning. This can lead to confusion if read as plaintext or by a screen reader, which may not announce the italicization. It’s best to make the exact meaning explicit by rephrasing, such as in I specifically will go there.
If you need to use italics for stress, prefer explicit `<em></em>`.
The `dfn` element can be used in Markdown. `_`/`_` is an acceptable fallback.
Bold¶
Use `**`/`**` bold for emphasis. Take `**`/`**` as semantically equivalent to the `<strong>` element.
Use the `<b>` element explicitly rather than `**`/`**` for text that should be bold but not emphasized – i.e. semantically distinct from the surrounding text. For example, you might write `<b>Score:</b> 55.3%`.
Code and math¶
Tip: Smarty plugin
With the Material for MkDocs Smarty plugin, you can use `++`/`++` instead of `<kbd>`/`</kbd>`; e.g. `++ctrl+alt+del++`.
Use backticks for code, `<kbd>`/`</kbd>` for keyboard keys. For math, use LaTeX inline `$`/`$` for single-line and `$$`/`$$` for multi-line.
Menu navigation¶
To describe menu navigation, use `➤` in italics; e.g. File ➤ Export ➤ Export as Text. Try to use the exact words, capitalization, and punctuation. For example, write File ➤ Settings... ➤ Advanced if the menu uses `...` (and not U+2026 / `…`).
Encoding¶
Write most non-ASCII characters as-is, not with entity references. For example, write an en dash as `–`, not `&ndash;`.
Except, use hexadecimal entity references for
- non-space whitespace characters (such as no-break space, `&#xA0;`);
- punctuation that is highly likely to confuse anyone reading the source code (such as soft hyphen, `&#xAD;`); and
- characters that must be escaped for technical reasons.
Unicode characters¶
Tip: Smarty plugin
With the Material for MkDocs Smarty plugin, you can use `'` and `"` for quotation marks, `--` for en dashes, `---` for em dashes, and `...` for ellipses.
Use the correct Unicode characters for punctuation. (Of course, use regular hyphen-minus (U+002D) for hyphens, not hyphen (U+2010).)
Examples
- `’` for apostrophes
- `‘`, `’`, `“`, and `”` for quotation marks
- `–` (en dash) for numerical ranges (e.g. `5–10`)
- `—` (em dash) to separate a blockquote and its source
- `‒` (figure dash) in numerical formatting
- `…` (ellipses)
- `−` (minus sign)
- `µ` (micro sign)
Punctuation (prescriptive grammar)¶
Use an en dash surrounded by spaces (–
) to mark breaks in thoughts, not an em dash. For example:
For i.e. and e.g., skip the comma (British English) and normally introduce with `;`. For example: say something nice; e.g. “nice boots”.
Abbreviations¶
For the first appearance, consider writing it out in this format: Public Library of Science (PLOS) †. Omit periods (.
) for initialisms; e.g. USA, not U.S.A..
† Note that the correct abbreviation for PLOS is PLOS, not PLoS.
You can use, e.g., the Material for MkDocs Abbreviations extension, or just an HTML `<abbr>` tag with the `title` attribute.
Admonitions¶
Use Material for mkdocs admonitions, GitHub-Flavored Markdown alerts, or other “admonition” syntax the same way as the <aside>
element. The content should be (A) “offset” or “aside”, (B) important when skimmed, or (C) metatext (describes its surrounding content).
Footnotes¶
Rationale
These symbols are easily recognized as indicating footnotes. Superscript numbers could be confused with reference numbers or even exponents in some places, and they are less accessible for people who are vision-impaired.
- The asterisk `*` has other very common meanings and is inconvenient in Markdown. (The U+204E `⁎` avoids the second problem.)
- The double vertical bar `‖` has other very common meanings.
- The pilcrow `¶` is too heavy and prominent in most fonts.
- Doubled-up footnote symbols mean you have too many footnotes.
Use either a Markdown extension for footnotes, or follow this format:
The symbols don’t need to be superscripted. Placement is discretionary: position before or after punctuation, with or without a space. These symbols are recommended, in order: `†` (dagger), `‡` (double dagger), `§` (section mark), `♯` (musical sharp), `♮` (musical natural), and `◊` (lozenge).
You can modify this list if needed, or use another scheme like superscript numbers, superscript lowercase letters, etc.
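If footnotes are generated by a script, the recommended order can be encoded directly. This is an illustrative sketch only; `FOOTNOTE_SYMBOLS` and `footnote_symbol` are hypothetical names, and raising an error when the list runs out is an assumption based on the remark that doubled-up symbols mean too many footnotes.

```python
# Recommended footnote symbols, in order of use.
FOOTNOTE_SYMBOLS = ("†", "‡", "§", "♯", "♮", "◊")


def footnote_symbol(index: int) -> str:
    """Return the symbol for the footnote at 0-based position `index`."""
    if not 0 <= index < len(FOOTNOTE_SYMBOLS):
        # Rather than doubling up symbols, signal that there are too many footnotes.
        raise IndexError("too many footnotes for the available symbols")
    return FOOTNOTE_SYMBOLS[index]
```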
Citations¶
For references, use the IEEE inline style: `[1]`, `[2]`, etc. IEEE style is also encouraged for bibliographies.
Quotations¶
Rationale
This preserves the semantic difference between punctuation inside and outside of quotations. This rule is always followed when using code in backticks, anyway.
Place punctuation outside of quotation marks (British-style rules). For example, in Also write ‘hard’, ‘difficult’, or ‘strenuous’.
Introduce code blocks with punctuation only where semantically valid. If it is semantically valid, use a colon rather than a comma. In blockquotes, use `_— author_` (with an em dash) to cite the source.
Examples
In the following block, use then run, not then run:.
However, use a colon here:
Inline lists (enumerations)¶
For inline lists, follow this format:
(1) Use `(1)`, `(a)`, `(A)`, `(i)`, or `(I)`;
(2) use commas or semicolons;
(3) start a line for each item;
(4) specify `and`, `or`, or `nor`; and
(5) end the last line with the applicable punctuation.
Formatting numbers¶
Numbers, units, dimensioned quantities, ranges, and measurement uncertainties must be readable. This guide follows the IEEE recommendation. Also see NIST note 1297 and the Guides in Metrology.
!!! tip: “Tip: LuaTeX, siunitx, and fontspec”
LuaTeX, [siunitx](https://ctan.org/pkg/siunitx), and [fontspec](https://ctan.org/pkg/fontspec)
is the ideal solution, but it is not usable in Markdown.
You need a suitable font, and `fontspec`, `\unimathsetup` and `\sisetup` as shown in
[`fontworthy.sty`](https://github.com/dmyersturnbull/desert-latex/blob/main/src/fontworthy.sty#L220).
Numbers¶
Use:
- A period (`.`) as the decimal separator
- A narrow no-break space (U+202F) as the thousands separator
Example:
- ✅ Preferred
1 024.222 222
(result: 1 024.222 222) - ✅ Preferred (LaTeX) `\(1~024.222~222\) (result: \(1~024.222~222\))
- ✅ Preferred (SIUnitX)
$\num{1024.222222}$
- 🟨 Acceptable
1024.222222
(result: 1024.222222) - ❌ Not acceptable
1,024.222222
or 1.024,222222
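Grouping digits by hand is error-prone, so a helper can do it. The sketch below assumes the input is a plain decimal string (no exponent) and groups digits in threes on both sides of the decimal point, as in the preferred example. The names `NNBSP` and `group_digits` are hypothetical.

```python
NNBSP = "\u202f"  # narrow no-break space


def group_digits(number: str) -> str:
    """Group the digits of a decimal string in threes using U+202F."""
    integer, point, fraction = number.partition(".")
    # Let Python group the integer part with commas, then swap the separator.
    grouped = f"{int(integer):,}".replace(",", NNBSP)
    if not point:
        return grouped
    chunks = [fraction[i:i + 3] for i in range(0, len(fraction), 3)]
    return grouped + "." + NNBSP.join(chunks)
```

The input is taken as a string rather than a float so that trailing zeros and the original precision are preserved.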
Dimensioned quantities¶
Use:
- A normal space or a no-break space (U+00A0) to separate magnitude and units.
- A middle dot (`·`, U+00B7) to multiply units.
- Unicode superscript digits (`⁰¹²³⁴⁵⁶⁷⁸⁹`) and minus sign (`⁻`) for unit exponents.
Examples:
- ✅ Preferred
5 kg·m²·s⁻²
(result: 5 kg·m²·s⁻²) - ✅ Preferred (LaTeX)
$5~\mathrm{kg\cdot m^2\cdot s^{-2}}$
(result: \(5~\mathrm{kg \cdot m^2 \cdot s^{-2}}\)) - ✅ Preferred (SIUnitX)
$\qty{5 \kilogram\meter\squared\per\second\squared}$
- 🟨 Acceptable
5 kg m^2 / s^2
(result: 5 kg m^2 / s^2) - ❌ Not acceptable
5 kg*m^2/s^2
(result: 5 kg*m2/s2)
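The preferred plain-text form can be produced from a mapping of unit symbols to exponents. This is a minimal sketch with hypothetical names (`format_units`, `format_quantity`); it assumes integer exponents and uses a no-break space between magnitude and units.

```python
# Map ASCII digits and the hyphen-minus to Unicode superscript characters.
_SUPERSCRIPTS = str.maketrans("0123456789-", "⁰¹²³⁴⁵⁶⁷⁸⁹⁻")


def format_units(units: dict[str, int]) -> str:
    """Join unit symbols with middle dots, writing exponents as superscripts."""
    parts = []
    for symbol, exponent in units.items():
        if exponent == 1:
            parts.append(symbol)  # an exponent of 1 is left implicit
        else:
            parts.append(symbol + str(exponent).translate(_SUPERSCRIPTS))
    return "\u00b7".join(parts)


def format_quantity(magnitude: str, units: dict[str, int]) -> str:
    """Separate magnitude and units with a no-break space (U+00A0)."""
    return f"{magnitude}\u00a0{format_units(units)}"
```

For instance, `format_quantity("5", {"kg": 1, "m": 2, "s": -2})` yields the same text as the preferred example above.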
Specific cases¶
- For the micro SI prefix, use the micro sign, `µ` (U+00B5). †
- For ohms, use the Greek capital letter omega, `Ω` (U+03A9). †
- Write `50 kibibytes` or `50 KiB`, using the base-2 ISO/IEC 80000 standard, which includes prefixes kibi- (Ki), mebi- (Mi), gibi- (Gi), tebi- (Ti), etc.
- You may use either `50%` (widely used) or `50 %` (NIST-recommended).
- For angles, omit a space before `°`, as in `90°`.
- Prefer decimal degrees notation – e.g. `90° 30′ 15″` – for longitude and latitude. Use no-break spaces (U+00A0), and proper symbols for prime (`′`, U+2032) and double prime (`″`, U+2033).
† Why U+00B5 micro and not U+03BC mu?
Use the micro sign instead of the Greek letter mu, but capital omega instead of the ohm sign. From the Unicode spec, “Greek Letters as Symbols”:
For compatibility purposes, a few Greek letters are separately encoded as symbols in other character blocks. Examples include U+00B5 µ MICRO SIGN in the Latin-1 Supplement character block and U+2126 Ω OHM SIGN in the Letterlike Symbols character block. The ohm sign is canonically equivalent to the capital omega, and normalization would remove any distinction. [The Ohm sign’s] use is therefore discouraged in favor of capital omega. The same equivalence does not exist between micro sign and mu, and use of either character as a micro sign is common.
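The difference described in the quotation can be observed with Python's standard `unicodedata` module. The sketch below only demonstrates the normalization behavior; the constant and function names are chosen for illustration.

```python
import unicodedata

OHM_SIGN = "\u2126"
CAPITAL_OMEGA = "\u03a9"
MICRO_SIGN = "\u00b5"
SMALL_MU = "\u03bc"


def canonical(text: str) -> str:
    """Apply canonical composition (NFC), the usual normalization for text."""
    return unicodedata.normalize("NFC", text)


# The ohm sign is canonically equivalent to capital omega, so NFC replaces it.
assert canonical(OHM_SIGN) == CAPITAL_OMEGA
# The micro sign has only a compatibility mapping, so NFC leaves it alone.
assert canonical(MICRO_SIGN) == MICRO_SIGN
# Only compatibility normalization (NFKC) folds the micro sign into mu.
assert unicodedata.normalize("NFKC", MICRO_SIGN) == SMALL_MU
```

This is why writing the ohm sign gains nothing (it does not survive NFC), whereas the micro sign remains distinct from mu.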
Uncertainty measurements¶
State whether a value means standard error or standard deviation. Do not write `5.0 ±0.1` – that’s ambiguous. You may use the abbreviations standard error (SE), standard deviation (SD), and confidence interval (CI), or spell them out.
Use one of these formats:
- Standard error: 7.65 ±1.2 (SE)
- Standard deviation: 7.65 ±0.54 (SD)
- Confidence interval: 7.65 (4.0–12.5, 95% CI)
- SE/SD with units: (7.65 ±0.54) J·m⁻² (SD)
- CI with units: 7.65 (4.0–12.5, 95% CI) J·m⁻²
- CI with units (alt): 7.65 J·m⁻² (4.0–12.5, 95% CI)
- CI with units (2nd alt): 7.65 J·m⁻² (4.0 to 12.5, 95% CI)
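These formats can be generated consistently with small helpers. The sketch below is illustrative: `format_spread` and `format_ci` are hypothetical names, values are passed as strings so that significant figures are preserved, and the CI helper implements only the first "alt" layout (units before the interval).

```python
def format_spread(value: str, spread: str, kind: str, units: str = "") -> str:
    """Format a value with its standard error ("SE") or standard deviation ("SD")."""
    if kind not in ("SE", "SD"):
        raise ValueError("kind must be 'SE' or 'SD'")
    core = f"{value} ±{spread}"
    if units:
        # Parenthesize so that the units apply to both numbers.
        return f"({core}) {units} ({kind})"
    return f"{core} ({kind})"


def format_ci(value: str, low: str, high: str, level: int = 95, units: str = "") -> str:
    """Format a value with a confidence interval, using an en dash for the range."""
    interval = f"({low}\u2013{high}, {level}% CI)"
    if units:
        return f"{value} {units} {interval}"
    return f"{value} {interval}"
```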
Dates and times¶
Use RFC 3339; e.g. `2023-11-02T14:55:00-07:00`. Note that the UTC offset is written with a hyphen, not a minus sign. If a timezone is needed, use a canonical IANA timezone such as `America/Los_Angeles`, and set it in square brackets after the UTC offset. For example: `2023-11-02T14:55:00 -07:00 [America/Los_Angeles]`.
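Python's `datetime.isoformat` already produces RFC 3339-compatible text for timezone-aware values. The sketch below wraps it; `format_timestamp` and its `zone_name` parameter are hypothetical, and the function does not verify that the zone name matches the offset.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def format_timestamp(moment: datetime, zone_name: Optional[str] = None) -> str:
    """Format an aware datetime, optionally followed by an IANA zone in brackets."""
    if moment.utcoffset() is None:
        raise ValueError("the timestamp must carry a UTC offset")
    text = moment.isoformat(timespec="seconds")
    if zone_name is None:
        return text
    # The first 19 characters are YYYY-MM-DDTHH:MM:SS; the rest is the offset.
    stamp, offset = text[:19], text[19:]
    return f"{stamp} {offset} [{zone_name}]"


# Usage: a fixed offset of UTC-08:00 (an arbitrary example value).
example = datetime(2023, 12, 2, 14, 55, tzinfo=timezone(timedelta(hours=-8)))
print(format_timestamp(example))
print(format_timestamp(example, "America/Los_Angeles"))
```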
Durations and intervals¶
For durations, use a form like `8.3 s`. `hr`, `min`, and `sec`/`s` are acceptable abbreviations, but `M` for minute is not. `hh:mm:ss` (e.g. `12:30:55`) is generally ok, but do not use `mm:ss` or `hh:mm` – these are ambiguous. Also do not use ISO 8601’s `P`/`PT` duration syntax (e.g. `PT45M55S`) in documentation.
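To avoid the ambiguous `mm:ss` form, a formatter can always emit all three fields. This is a short sketch with a hypothetical function name, assuming a non-negative whole number of seconds.

```python
def format_clock(total_seconds: int) -> str:
    """Format a duration as hh:mm:ss, always including the hours field."""
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"
```

A duration of 45 minutes and 55 seconds is therefore written `00:45:55`, not `45:55`.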
Filesystem paths and trees¶
Always use `/` as a path separator in documentation, and denote directories with a trailing `/`.
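When paths come from a Windows environment, the standard `pathlib` module can convert them to the documented form. The helper below is a sketch; `doc_path` and its `is_dir` flag are hypothetical.

```python
from pathlib import PureWindowsPath


def doc_path(raw: str, is_dir: bool = False) -> str:
    """Rewrite a path with `/` separators, adding a trailing `/` for directories."""
    # PureWindowsPath accepts both `\` and `/` and never touches the filesystem.
    text = PureWindowsPath(raw).as_posix()
    if is_dir and not text.endswith("/"):
        text += "/"
    return text
```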
For filesystem trees, use Unicode box-drawing characters. Refer to the research projects guide for an example.
Accessibility¶
Use descriptive text for links.
- ✅ Correct
Refer to the [documentation conventions](documentation.md).
- ❌ Incorrect
Click [here](documentation.md) for coding conventions.
HTML¶
Follow the applicable guidelines from the Markdown section.
Attributes¶
Use kebab-case for `id` and `name` values and for `data` keys and values. Use the `alt` attribute for media elements, including `<img>`, `<video>`, `<audio>`, and `<canvas>`.
Formatting¶
Use Prettier with default options except for line length, which must be 120. Note that Prettier wraps tags in a way that looks strange at first; it does that to avoid adding extra whitespace.
Closing tags¶
Always include the `<html>`, `<head>`, and `<body>` elements. Also, always close tags – for example, use `<p>The end.</p>`, not `<p>The end.`
These two practices improve readability and massively simplify parsing. The rules for omitting `<html>`, etc. are also complex and better ignored.
Formal grammars¶
Grammars may be specified in any well-defined meta-grammar. Specify the syntax used.
Notable good options:
With ABNF, avoid the incremental alternatives notation (`=/`). Because this modifies an already-defined rule, it complicates reading. Also avoid the core rules `CHAR`, `LWSP`, `CTL`, `VCHAR`, and `WSP`, which are misleading because they are restricted to ASCII.
*[IEC]: International Electrotechnical Commission