Scripts and build files¶
Command-line tools¶
Standard streams¶
As long as doing so sensible, CLI tools should read from stdin and write to stdout. In either cases, they must direct logging messages to stderr, never to stdout.
Arguments¶
Prefer named options over positional arguments, and limit positional argument lists to ≤ 3 items. Arbitrary-length lists are fine as long as all positions (1–∞) carry the same meaning. This table summarizes:
- ✅
<tool> [<options>]
- ✅
<tool> [<options>] <arg-1>
- ✅
<tool> [<options>] <arg-1> <arg-2>
- ✅
<tool> [<options>] <args...>
- ❌
<tool> [<options>] <arg-1> <args...>
(positional and varargs) - ❌
<tool> [<options>] <arg-1> <arg-2> <arg-3> <arg-4>
(> 3 positional args)
Allow named options both before and after positional arguments.
Use -
for single-character options and --
for long ones (e.g. --user
). Allow both --option <arg>
and --option=<arg>
.
Use standard option names:
--help
--version
(output the version to stdout and exit)--verbose
(and optionally-v
)--quiet
(and optionally-q
)
Bash¶
Rationale
- It’s more portable, and it allows callers to choose the specific Bash.
- This allows the script to be used in a pipe. (Note: Shell bulletins use
2
to indicate misuse.) function my_fn {}
is effectively deprecated, non-portable, and never necessary. Note thatfunction my_fn() {}
is technically invalid in all major shell, though most will tolerate it.- It’s more clear, and it prevents bugs like the one illustrated on the last line below.
Use bertvv’s Bash guidelines alongside the following rules and exceptions.
- As the only exception:
$var
is preferred over${var}
where either would work. - Use
#!/usr/bin/env bash
. - Read stdin where applicable, and reserve stdout for machine-readable output. For usage errors: write the usage to stderr;
exit 2
. For--help
: write the usage and program description to stdout;exit 0
. - Use
my_fn() {}
, notfunction my_fn {}
, and definitely notfunction my_fn() {}
. - Always explicitly
exit
. - Use
printf
, notecho
.
Note
I recently (2024-08) changed my mind on $var
vs. ${var}
. I previously followed bertvv’s advice and used forced ${}
. That adds clarity, but it contradicts my less > more principle.
Example¶
#!/usr/bin/env bash
# SPDX-FileCopyrightText: Copyright 2024, Contributors to the gamma package
# SPDX-PackageHomePage: https://github.com/the-gamma-people/gamma
# SPDX-License-Identifier: MIT
set -o errexit -o nounset -o pipefail # (1)!
declare -r prog_name=$(basename "$0")
declare -r -i default_min=2
_usage="Usage: $prog_name in-dir [min-hits=$default_min]"
_desc="Computes gamma if in-dir contains < min-hits results."
if (( $# == 1 )) && [[ "$1" == "-h" || "$1" == "--help" ]]; then
printf '%s\n%s\n' "$_desc" "$_usage"
exit 0
fi
if (( $# == 0 )) || (( $# > 2 )); then
>&2 printf 'Invalid usage.\n%s\n' "$_usage" # (2)!
exit 2 # (3)!
fi
declare -r in_dir="$1"
declare -r -i min_hits=$(( "${2:-}" || $default_min ))
gamma::fail() { # (4)!
>&2 printf '[FATAL] %s\n' "$1"
exit 1 # (5)!
}
gamma::must_compute() {
_count=$(( ls -l -- "$in_dir" | wc -l ))
return (( _count < $min_hits ))
}
gamma::compute() {
ping google.com || fail "Cannot ping Google. Check cables?"
}
gamma::main() {
>&2 printf '[INFO] Initializing...\n'
gamma::must_compute && gamma::compute
}
gamma::main
exit 0 # (6)!
- Set
nounset
,errexit
, andpipefail
. WithIFS=$'\n\t'
, these options are sometimes called Bash Strict Mode. UsingIFS=$'\n\t'
is generally safer, but it’s less interoperable. - Always log to stderr.
exit 2
for usage errors.- Omit
function
. Using a package prefix (e.g.gamma::
) can be helpful. exit
with 1 or 3–125. Seesysexits.h
(ignoringEX_USAGE
).- Without this line, the script returns
1
ifgamma::must_compute
returns1
. If usingmust_compute
’s exit code is desired, useexit $?
.
Don’t rely on Strict Mode
Even if you set noset
, errexit
, and pipefail
, write the code as though they’re not enabled. Having to switch between strict and nonstrict styles is difficult, and you’re likely to make a mistake. Practicing the robust way only reinforces it in your memory (and helps others reading it, too). And code that relies on these options are brittle when moved or copied to a script (or shell) that doesn’t set them. They can even get unset midway through a script; e.g. when source
ing another script.
Parsing command-line arguments¶
Either keep it simple as in the above example, or use a case
statement. See todos.sh
for a case
example.
Rationale
getopts
, as well the variant of getopt
on BSD, cannot parse –long-style options. getopt
has other differences between distributions, as well. This makes case
statements a substantially better choice.
Docker¶
Consider using a linter such as hadolint.
ENV
commands¶
Break ENV
commands into one line per command. ENV
no longer adds a layer in new Docker versions, so there is no need to chain them on a single line.
Labels¶
Use Open Containers labels, including:
org.opencontainers.image.version
org.opencontainers.image.vendor
org.opencontainers.image.title
org.opencontainers.image.url
org.opencontainers.image.documentation
Multistage builds¶
Where applicable, use a multistage build to separate build-time and runtime to keep containers slim. For example, when using Maven, Maven is only needed to assemble, not to run.
Here, maven:3-eclipse-temurin-21
is used as a base image, maven is used to compile and build a JAR artifact, and everything but the JAR is discarded. eclipse-temurin:21
is used as the runtime base image, and only the JAR file is needed.