Starting A Project
Core idea
A shell script is not designed once and typed in finished form. It grows in runnable stages, where every stage is a working program — just less capable than the next one. The starting move is to write the smallest version that produces some correct output, run it, and only then add the next slice. The discipline that makes this work is treating every literal value that might appear in more than one place as a candidate to be promoted to a named constant or variable, and treating every multi-line block of text as a candidate for a here document.
Author's framing: Programs are usually built up in a series of stages, each adding features. A program that is easy to read is easier to maintain, and the same techniques make it easier to write because there is less to type.
Stage 1 — Print something correct
The first version of the running example is a sequence of echo lines that emit a valid HTML skeleton to standard output. It has no data, no logic, no functions. It does exactly one job: produce a well-formed document. That is enough to verify the shebang, the executable bit, the path, and the redirection plumbing all work end to end. Subsequent stages can lean on that foundation.
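A minimal sketch of this first stage, assuming a hypothetical script name (sys_info_page) and a temporary working directory. The here document at the top is only a way to create the file non-interactively for the demonstration; in practice the eight echo lines would be typed into an editor.

```shell
#!/bin/bash
# Assumptions: the names workdir, script, and sys_info_page are illustrative.
workdir=$(mktemp -d)
script="$workdir/sys_info_page"

# Create the stage-1 script. The quoted delimiter keeps the text literal.
cat > "$script" <<'_EOF_'
#!/bin/bash
# Stage 1: emit a minimal, well-formed HTML page. No data, no logic.
echo "<html>"
echo "  <head>"
echo "    <title>Page Title</title>"
echo "  </head>"
echo "  <body>"
echo "    Page body."
echo "  </body>"
echo "</html>"
_EOF_

chmod +x "$script"                 # set the executable bit
"$script" > "$workdir/page.html"   # run by path, redirect stdout to a file
cat "$workdir/page.html"           # confirm the plumbing worked end to end
```

If the last command prints the eight lines of HTML, the shebang, the executable bit, the path, and the redirection have all been exercised once.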
Stage 2 — Collapse repetition with quoted strings
A run of single-purpose echo commands can be folded into one echo whose argument is a multi-line quoted string. The shell keeps reading until the closing quote — including across newlines. The script gets shorter, but more importantly, adding the next line of output now means editing one place instead of inserting a new command.
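The fold can be sketched as follows. The two versions are wrapped in functions (stage1 and stage2, both hypothetical names) only so that they can sit side by side; the HTML is shortened for brevity.

```shell
#!/bin/bash
# Before: one echo command per line of output.
stage1() {
    echo "<html>"
    echo "  <body>"
    echo "    Page body."
    echo "  </body>"
    echo "</html>"
}

# After: a single echo. The shell keeps reading until the closing quote,
# so the newlines inside the string become part of the argument.
stage2() {
    echo "<html>
  <body>
    Page body.
  </body>
</html>"
}

stage2
```

Both functions should produce the same five lines, so the change shortens the script without altering its output.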
Stage 3 — Name the values that will change
When the title System Information Report starts appearing in two places, that is the signal to promote it to a variable. Once named, the value lives in one declaration. Any future rename touches one line, not many. The shell creates variables implicitly on first assignment — a convenience that doubles as a hazard, because a misspelled name silently expands to nothing.
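A short sketch of the promotion, followed by the hazard it introduces. The variable names title and broken_heading are illustrative, and titel is a deliberate misspelling.

```shell
#!/bin/bash
# Stage 3: the title is assigned once and expanded in two places.
title="System Information Report"

echo "<html>
  <head>
    <title>$title</title>
  </head>
  <body>
    <h1>$title</h1>
  </body>
</html>"

# The hazard: "titel" below is a deliberate misspelling. The shell treats it
# as a separate, unset variable and expands it to nothing, without complaint.
broken_heading="<h1>$titel</h1>"
echo "with the typo: $broken_heading"
```

The final line prints an empty h1 element, which is the only visible sign that anything went wrong.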
Stage 4 — Distinguish constants from variables
A value that never changes during a run (the page title, the report version, an HTML token) is conceptually a constant, even though bash makes no syntactic distinction. The convention is to write constants in UPPERCASE and variables in lowercase, signalling intent to the reader. declare -r can enforce immutability when it matters; in practice the naming convention does most of the work.
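The sketch below shows both the naming convention and the optional enforcement. REPORT_VERSION and its value are assumptions made for the example, as is the current_time variable.

```shell
#!/bin/bash
# Constant by convention: uppercase, assigned once, never reassigned.
TITLE="System Information Report"

# Constant by enforcement: declare -r marks the name read-only.
declare -r REPORT_VERSION="1.0"    # hypothetical value

# Ordinary variable: lowercase, expected to differ from run to run.
current_time=$(date +%F)

echo "$TITLE v$REPORT_VERSION, generated $current_time"

# Reassigning a read-only name fails. The subshell confines the error so
# that the rest of this demonstration keeps running.
if ( REPORT_VERSION="2.0" ) 2>/dev/null; then
    echo "reassignment succeeded (unexpected)"
else
    echo "reassignment rejected; REPORT_VERSION is still $REPORT_VERSION"
fi
```

Nothing stops a later line from assigning to TITLE; only REPORT_VERSION is actually protected, which is why the uppercase convention carries most of the signal.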
Stage 5 — Reach for here documents when text blocks dominate
When the script's output is mostly literal text with a few interpolations, switching from echo to a cat <<_EOF_ ... _EOF_ here document removes the burden of escaping every embedded quote. The shell still expands $VAR and $(cmd) inside the heredoc, so the script can stay templated. Quoting the opening token (<<"_EOF_") turns off all expansion when literal output is needed.
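A minimal sketch contrasting the two forms. The function names templated and literal are hypothetical, and the class attribute is there only to show that embedded double quotes need no escaping.

```shell
#!/bin/bash
TITLE="System Information Report"

# Unquoted delimiter: $TITLE and $(...) are expanded inside the body.
templated() {
    cat <<_EOF_
<h1 class="title">$TITLE</h1>
<p>Generated $(date +%F)</p>
_EOF_
}

# Quoted delimiter: the body is passed through exactly as written.
literal() {
    cat <<"_EOF_"
<h1 class="title">$TITLE</h1>
_EOF_
}

templated
literal
```

The first function prints the real title and today's date; the second prints the characters $TITLE unchanged.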
Why it matters
The reason to develop a script in named, runnable stages — rather than writing the whole thing and then debugging it — is that most bugs come from changes, and the smaller the change, the faster the bug is found. Empty-variable expansion, misspelled names, ambiguous parameter boundaries, and forgotten quotes are all easier to localise to "the one thing I did in the last five minutes" than to "somewhere in the 200 lines I just wrote."
It makes silent failures loud
The shell does not warn about a misspelled variable. If foo is set and fool is a typo for it, $fool expands to nothing, so cp $foo $fool reaches cp with one argument instead of two. The shell itself reports nothing; at best cp complains about a missing operand, with no hint that a misspelled name is the cause. A staged build catches this within seconds of writing the new line, because the previous stage was known to be working. Without staging, the same bug can lurk for hours.
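To see what a command actually receives when a misspelled name is expanded, the sketch below counts arguments instead of calling cp. count_args is a hypothetical helper, and fool stands in for the misspelling of foo.

```shell
#!/bin/bash
# Hypothetical helper: report how many arguments a command would receive.
count_args() {
    echo "$#"
}

foo="report.txt"
unset fool    # "fool" is the misspelled name; it is never assigned

unquoted=$(count_args $foo $fool)      # the empty expansion vanishes entirely
quoted=$(count_args "$foo" "$fool")    # the empty string survives as an argument

echo "unquoted: $unquoted argument(s), quoted: $quoted argument(s)"
```

Either way the receiving command sees something other than what was intended, and neither form names the misspelled variable as the culprit. A run after every small change remains the practical safeguard.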
It pays for itself when scripts grow
A 30-line script written in one shot is fine. A 300-line script written in one shot is a swamp. The habits formed at 30 lines — quote your variables, name your constants, prefer heredocs for long output — are what keep the 300-line script readable. They are also the habits that make code from one script paste cleanly into another.
Key takeaways
Mental model
Practical application
- Write the shebang and one echo. Save, chmod +x, run. If you do not see the output, your shell, path, or executable bit is the problem; fix that before writing line two.
- Add output one block at a time. Each block ends with a manual run. Do not write three blocks and then run. Write one, run, write the next.
- Promote on the second occurrence. The moment a literal string appears in a second location, replace both occurrences with $VAR and add VAR="value" at the top. Run it again; the output should be byte-for-byte identical.
- Quote every expansion. Write echo "$title", not echo $title. The only time to omit quotes is when you want word-splitting, which is rarer than people think.
- Heredoc the long blocks. If a single echo argument is more than five lines of text, swap to cat <<_EOF_ ... _EOF_. The diff that introduces the swap should not change the output.
- Resist adding logic. Variables and heredocs first; conditionals and loops only after the linear version reads cleanly.
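The "byte-for-byte identical" check in the steps above can be automated. This is one possible approach, not a prescribed one: the functions before and after are hypothetical stand-ins for the script as it was and as it is after promoting the title to a variable.

```shell
#!/bin/bash
# Stand-in for the script before the change: the literal appears twice.
before() {
    echo "<title>System Information Report</title>"
    echo "<h1>System Information Report</h1>"
}

# Stand-in for the script after the change: the literal is promoted.
after() {
    local title="System Information Report"
    echo "<title>$title</title>"
    echo "<h1>$title</h1>"
}

# cmp -s prints nothing and exits 0 only when both streams match exactly.
if cmp -s <(before) <(after); then
    verdict="identical"
else
    verdict="different"
fi
echo "refactor check: $verdict"
```

With real scripts, the same comparison works on two saved output files, one captured before the edit and one after.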
Example
Imagine you are scripting the daily backup digest your team posts to a status channel. The first version is six echo lines hard-coding hostname, date, and disk usage. It works. On day two you decide the heading should read "Backups for $HOSTNAME — $(date +%F)" everywhere, so you promote it to a TITLE constant. On day three you realise the digest body is mostly literal Markdown with two interpolations, so you switch the body to a here document delimited by _EOF_. On day four someone runs the script from cron under a shell that does not set HOSTNAME (bash sets it automatically, but a minimal sh such as dash does not, and cron's sparse environment does not supply it); the title silently renders as " — 2026-05-26". Because every earlier stage was verified, the new environment is the only untested factor, and the bug is quick to localise. The fix is one line that supplies a default: TITLE="Backups for ${HOSTNAME:-unknown} — $(date +%F)".
The script that was a flat list of echo calls on day one is, by day four, a parameterised, here-documented, defended template — and every change between day one and day four was small enough to verify in under thirty seconds.
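A sketch of what the day-four digest might look like. The Markdown body, the df call, and the build_digest function are assumptions made for illustration, and a plain hyphen is used as the separator in the title. The title is computed inside the function here so that the fallback can be exercised in a subshell.

```shell
#!/bin/bash
# Hypothetical digest builder; the body layout is an assumption.
build_digest() {
    # Fall back to "unknown" when HOSTNAME is unset or empty.
    local title="Backups for ${HOSTNAME:-unknown} - $(date +%F)"
    cat <<_EOF_
# $title

Disk usage on / at the time of the run:

    $(df -h / | tail -n 1)
_EOF_
}

build_digest

# Simulate an environment without HOSTNAME. The subshell keeps the unset
# from leaking into the rest of the script.
( unset HOSTNAME; build_digest | head -n 1 )
```

The last command should print a heading that begins with "# Backups for unknown", showing that the default takes over instead of leaving a gap.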
Related lessons
Related concepts
- Bash Scripting