Pragmatic Paranoia

Core idea

Perfect software does not exist — design as if you know it

Thomas and Hunt open with a flat axiom: nobody in the history of computing has shipped perfect software, and you will not be the first. The Pragmatic Programmer accepts this not as a defeat but as a design input. If your code, your colleagues' code, your dependencies, and your runtime are all going to make mistakes, the discipline is to build systems that fail well — early, loudly, locally — instead of failing late, silently, and catastrophically.

The five topics in this chapter are coordinated techniques for doing that. Design by Contract states correctness conditions at the interface, where caller and callee can both see them. Dead Programs Tell No Lies says: when reality contradicts your assumptions, stop rather than keep running. Assertive Programming says: write checks for the things you "know" are true. Resource Balancing says: every acquisition has a release, and you are responsible for both. Don't Outrun Your Headlights says: take small steps so the cost of being wrong stays small.

Pragmatic paranoia is defensive, not paranoid

The chapter's tone is important. The authors are not asking you to validate every variable seventeen times. They are asking you to design contracts up front, to enforce them at boundaries, to fail fast when they're violated, and to take steps no bigger than the next checkpoint you can verify. The result is paradoxical: paranoid code is usually less code than fearful code, because the boundaries are explicit and the checks live in one place.

Tip 36: You Can't Write Perfect Software.

Why it matters

Bugs compound when they're hidden

The cheapest bug is one that crashes on line 50 of a trivial test case. The most expensive is one that silently corrupts a database for nine months before anyone notices. Every technique in this chapter is a way to convert the second kind of bug into the first kind. A contract fails the moment a caller violates a precondition. An assertion fails the moment an invariant breaks. A try/finally around a resource releases it at the boundary you expected, even when the code in between throws. Small steps mean the distance between your last verified state and your next failure stays short.

Most "production incidents" trace to one of these five disciplines

Production postmortems often read like a roll call of this chapter. Service crashed because nobody validated input from an upstream: Design by Contract. Service ran for hours in a corrupt state because nobody stopped on the first signal: Dead Programs Tell No Lies. Resource leak under load: How to Balance Resources. Big-bang deploy that took down half the stack: Don't Outrun Your Headlights. Pragmatic Paranoia is essentially the SRE handbook, written in 1999.

Key takeaways

The topics in this chapter

Topic 23 — Design by Contract

Borrowed from Bertrand Meyer's Eiffel work, Design by Contract (DBC) treats every function as a small commercial transaction. The function makes the caller a promise — "if you give me X, I'll give you Y" — and both parties are bound by it.

  • Preconditions — what the caller must guarantee before the function will run. (Amount must be positive; account must exist.)
  • Postconditions — what the function guarantees when it returns. (A new transaction will be visible on the account.)
  • Invariants — properties that always hold across the lifetime of the object or module. (Account balance is never negative.)

Tip 37: Design with Contracts.

When a precondition is violated, the caller has the bug. When a postcondition fails, the callee has the bug. This is not a stylistic distinction — it tells you where to look. Languages with first-class DBC (Eiffel, Clojure's pre/post, Elixir's guard clauses) let you enforce contracts at the function signature. Languages without it can still benefit from comments, runtime assertions, or unit tests that encode the contract.
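In a language without built-in contracts, the same idea can be approximated with explicit runtime checks. The following is a minimal sketch, not the authors' code: `ContractError`, `require`, `ensure`, and the `Account` class are hypothetical names chosen for illustration, and amounts are assumed to be integer cents.

```python
class ContractError(Exception):
    """Raised when a precondition, postcondition, or invariant is violated."""


def require(condition: bool, message: str) -> None:
    # Precondition check: a failure here means the caller has the bug.
    if not condition:
        raise ContractError(f"precondition failed: {message}")


def ensure(condition: bool, message: str) -> None:
    # Postcondition or invariant check: a failure here means the callee has the bug.
    if not condition:
        raise ContractError(f"postcondition failed: {message}")


class Account:
    def __init__(self, balance: int = 0) -> None:
        require(balance >= 0, "opening balance must not be negative")
        self.balance = balance  # integer cents, to avoid float rounding

    def _check_invariant(self) -> None:
        ensure(self.balance >= 0, "invariant: balance is never negative")

    def withdraw(self, amount: int) -> int:
        # Preconditions: what the caller must guarantee.
        require(amount > 0, "amount must be positive")
        require(amount <= self.balance, "amount must not exceed balance")
        old_balance = self.balance

        self.balance -= amount

        # Postcondition: what this method guarantees on return.
        ensure(self.balance == old_balance - amount, "balance reduced by amount")
        self._check_invariant()
        return self.balance
```

The error message says which side of the contract broke, which is the "where to look" signal described above: a "precondition failed" message points at the call site, a "postcondition failed" message points at `withdraw` itself.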

The authors weigh DBC against TDD and defensive programming and conclude it does something neither does: it forces you to think about correctness before you write code, in the vocabulary of the interface, not in the vocabulary of an example. DBC is "lazy" code in the best sense: the function is strict about what it will accept before it begins and promises as little as possible in return, so the boundary is unambiguous.

Topic 24 — Dead Programs Tell No Lies

When your code detects something that should be impossible, the correct response is almost always to stop immediately. The instinct is to try to recover, to log and continue, to "be resilient." Thomas and Hunt argue that this instinct is usually wrong: a program in an impossible state cannot reliably do anything, including recover. Continuing turns a contained bug into corrupted data, corrupted output, downstream failures, and a debugging nightmare.

Tip 38: Crash Early.

The principle has a few corollaries.

  • Check return values. Even from functions you "know" won't fail. The day they fail is the day a silent ignore makes the next ten minutes incomprehensible.
  • Prefer raising exceptions to returning sentinel values in languages that support it. Sentinels get ignored; exceptions propagate.
  • Don't catch and swallow. If you catch an exception, do something meaningful with it — log it with context, rethrow, or convert to a domain error. A bare catch that does nothing is a future production incident.

The phrase the authors return to: a dead program causes much less damage than a maimed one. The maimed program keeps running and quietly destroys things.
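The corollaries above can be sketched in a few lines. This is an illustrative example under stated assumptions, not code from the book: `ConfigError` and `load_port` are hypothetical names, and the configuration is assumed to arrive as a JSON string with a `port` key.

```python
import json


class ConfigError(Exception):
    """Domain error: the configuration cannot be used."""


def load_port(raw: str) -> int:
    try:
        config = json.loads(raw)
    except json.JSONDecodeError as exc:
        # Do not swallow: convert to a domain error and keep the cause attached.
        raise ConfigError(f"config is not valid JSON: {exc}") from exc

    if not isinstance(config, dict) or "port" not in config:
        # Raise instead of returning a sentinel such as -1 that callers can ignore.
        raise ConfigError("config must be an object with a 'port' key")

    port = config["port"]
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if not isinstance(port, int) or isinstance(port, bool) or not 1 <= port <= 65535:
        raise ConfigError(f"port must be an integer in 1..65535, got {port!r}")
    return port
```

A caller that forgets to handle `ConfigError` gets a traceback at startup, which is the point: the failure surfaces at the first bad value instead of as a mysterious connection error later.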

Topic 25 — Assertive Programming

Assertions are runtime checks for things you "know" to be true. The day an assertion fires is the day you discover you were wrong about something fundamental — and the assertion is what made it cheap to discover. The authors are emphatic on two points.

Tip 39: Use Assertions to Prevent the Impossible.

Don't use assertions for things that can happen. User input validation, network errors, missing files — these are part of the normal universe of behavior and deserve real error handling. Assertions are for the impossible: invariants you believe hold, postconditions you believe your code maintains, properties of internal state that should never be violated.

Don't disable assertions in production. The temptation is real — assertions cost a few cycles, and production code should be fast. The authors push back hard: production is exactly where you most need to know if an impossible thing happened, and the alternative (a silent corruption that surfaces three hours later as a customer support ticket) is far more expensive than the microseconds saved.

Tip 40: Don't Use Assertions in Place of Real Error Handling.

A well-asserted codebase reads like a contract written into the body of the function. Every assert balance >= 0 is a line of documentation that the runtime enforces.
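The split between the two kinds of check might look like the following sketch. The function names `parse_quantity` and `split_evenly` are hypothetical, chosen only to contrast input that can be wrong with internal arithmetic that should never be.

```python
def parse_quantity(text: str) -> int:
    # Bad user input CAN happen, so it gets real error handling, not an assert.
    try:
        quantity = int(text)
    except ValueError as exc:
        raise ValueError(f"quantity must be a whole number, got {text!r}") from exc
    if quantity <= 0:
        raise ValueError(f"quantity must be positive, got {quantity}")
    return quantity


def split_evenly(total_cents: int, parts: int) -> list[int]:
    """Split total_cents into `parts` shares that differ by at most one cent."""
    if total_cents < 0 or parts <= 0:
        raise ValueError("total_cents must be >= 0 and parts must be > 0")

    base, remainder = divmod(total_cents, parts)
    shares = [base + 1] * remainder + [base] * (parts - remainder)

    # "Impossible" if the arithmetic above is right: no cents created or lost.
    assert sum(shares) == total_cents, f"shares {shares} do not sum to {total_cents}"
    return shares
```

One Python-specific caveat: running the interpreter with the `-O` flag strips `assert` statements. Following the advice to keep assertions on in production means either not using `-O` or writing the check as an explicit `if ... raise`.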

Topic 26 — How to Balance Resources

Every resource you acquire — memory, file handle, lock, socket, database connection, transaction — must be released. The bug pattern is universal: code that allocates and forgets, or allocates and crashes between allocation and release.

Tip 41: Finish What You Start.

The authors' rule of thumb: whoever acquires releases. Don't expect the caller to clean up something you allocated; don't allocate something inside a callee that the caller can't see. Keep the lifetime as small and as visible as possible.

The mechanical aids exist in every language:

  • C++ RAII (destructors run when objects leave scope).
  • Java's and C#'s try/finally, plus Java's try-with-resources and C#'s using statement.
  • Python's with statement and context managers.
  • Go's defer.
  • Rust's ownership model and Drop trait, which make this discipline structural.

Tip 42: Act Locally.

Use them religiously. The discipline is to make the release mechanical, not voluntary, because voluntary release will eventually be forgotten under deadline pressure. Code that closes resources via defer or try-with-resources is far less likely to leak, even when the developer wasn't thinking about cleanup at the time.

Special cases worth thinking about: nested resources should be released in reverse order of acquisition (open A, open B, close B, close A); shared resources need ownership rules (who's responsible for the final release?); and long-lived resources (connection pools, caches) need explicit lifecycle management.
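The reverse-order rule for nested resources can be demonstrated with Python's `with` statement. This is a small sketch: `resource` is a hypothetical stand-in for a real file, lock, or connection, and the `events` list exists only to record the order of operations.

```python
from contextlib import contextmanager

events: list[str] = []  # records acquire/release order for the demonstration


@contextmanager
def resource(name: str):
    events.append(f"open {name}")
    try:
        yield name
    finally:
        # Runs on normal exit and when the body raises.
        events.append(f"close {name}")


def use_both(fail: bool) -> None:
    # Nested resources: B is released before A, the reverse of acquisition.
    with resource("A"), resource("B"):
        if fail:
            raise RuntimeError("work failed between acquire and release")
```

Whether `use_both` returns normally or raises, `events` ends up as open A, open B, close B, close A. The release is mechanical: nothing in the body of the `with` block has to remember it.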

Topic 27 — Don't Outrun Your Headlights

You drive at night only as fast as your headlights let you see, because anything further out is unknowable, and slamming on the brakes at 60 mph for a deer you couldn't see is too late. Code the same way. Take small steps, each one verifiable, each one a checkpoint you can return to.

Tip 43: Take Small Steps — Always.

In practice this means:

  • Commit small. A working unit of change you can revert.
  • Test small. After every change, run the tests that prove the system still works.
  • Deploy small. Smaller deploys mean smaller blast radii when they go wrong, and faster rollbacks.
  • Forecast small. Estimate the next concrete deliverable, not the project endpoint.
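One common way to deploy small is a percentage rollout behind a feature flag. The sketch below is an assumption about how such a gate could be written, not something the book prescribes: `in_rollout` is a hypothetical helper that hashes a user ID into one of 100 stable buckets.

```python
import hashlib


def in_rollout(user_id: str, percent: int) -> bool:
    """Return True if this user falls inside the first `percent` of 100 buckets."""
    if not 0 <= percent <= 100:
        raise ValueError(f"percent must be in 0..100, got {percent}")
    # A stable hash keeps each user in the same bucket across processes and restarts.
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return bucket < percent
```

Because the bucket for a given user never changes, raising the percentage only adds users: anyone who saw the feature at a lower setting still sees it at a higher one, and dropping the percentage to 0 is the rollback.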

Tip 44: Avoid Fortune-Telling.

The authors' deeper warning: avoid trying to predict the future in your code. Speculative generality — "we might need this someday, so let's build it in now" — is the most common form of outrunning your headlights. The someday rarely comes; the speculation usually points the wrong direction; the code complicates everything in the meantime. Build what the current step needs. When the next step arrives, build for that.

Mental model

Practical application

A code-review checklist

When reviewing your own code or someone else's, walk through these five questions:

  1. What are the preconditions, postconditions, and invariants of this function? If they're not stated (in code, comments, or tests), the contract is implicit — and implicit contracts get violated.
  2. What happens when something impossible is detected? Look for silent catches, ignored return values, "shouldn't happen" comments without an assertion.
  3. Are the invariants asserted? A function that maintains an invariant should check the invariant at its boundaries.
  4. Are all resources cleaned up on every exit path, including exceptions? Look for raw open/close pairs without try/finally or equivalent.
  5. Is each change small enough that I could revert it cleanly? Big PRs are headlights-outrunning by definition.

A PR that passes all five is hard to break in production. A PR that fails one is a future incident waiting for a trigger.

Contracts at the boundary, not everywhere

A practical refinement: enforce contracts hardest at the boundaries of your system — public APIs, network ingress, database reads, user input. Inside trusted boundaries, contracts can be lighter, because the data has already been vetted. The mistake is to validate everywhere or nowhere. The right answer is to validate once, where untrusted data crosses into trusted territory, and trust the rest.
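"Validate once, then trust" is often expressed by parsing untrusted input into a dedicated type at the boundary. The sketch below assumes a dictionary payload with `amount_cents` and `card_token` keys; `ChargeRequest`, `parse_charge_request`, and `describe` are hypothetical names for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChargeRequest:
    """A request that has already passed boundary validation."""
    amount_cents: int
    card_token: str


def parse_charge_request(payload: dict) -> ChargeRequest:
    # Boundary: untrusted data is checked here, once.
    amount = payload.get("amount_cents")
    token = payload.get("card_token")
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if not isinstance(amount, int) or isinstance(amount, bool) or amount <= 0:
        raise ValueError("amount_cents must be a positive integer")
    if not isinstance(token, str) or not token:
        raise ValueError("card_token must be a non-empty string")
    return ChargeRequest(amount_cents=amount, card_token=token)


def describe(request: ChargeRequest) -> str:
    # Inside the boundary: the type signals vetted data, so nothing is re-checked.
    return f"charge {request.amount_cents} cents to {request.card_token}"
```

This relies on a convention: internal code accepts `ChargeRequest` and only the boundary constructs one. Python does not stop someone from building a `ChargeRequest` directly, so the guarantee is as strong as the team's discipline about that rule.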

Example

The payment processor that failed well

Imagine a payment processor service. A chargeCard(amount, cardToken) function lives behind an internal API.

  • Contract: amount > 0, cardToken is a valid token (preconditions). On success, returns a transaction with status succeeded (postcondition). The aggregate amount_charged_today for the card never exceeds the daily limit (invariant).
  • Dead program: If the underlying card-network response contains a status not in the enum (succeeded, declined, error), the service does not invent a fallback — it raises UnknownStatusError and a circuit breaker opens. Better to fail fast than to silently treat an unknown status as either success or failure.
  • Assertion: Before returning, the function asserts transaction.amount == amount. This catches a class of bug where rounding errors or currency conversions silently mutate the charge.
  • Resource balancing: The DB transaction and the external HTTP call are both wrapped in try/finally (or with) blocks so that any exception releases the connection and rolls back the transaction.
  • Small steps: The feature ships behind a feature flag, enabled first for 1% of traffic, then 10%, then 100%, with metrics watched at each step.

The day the card network returns a new status code nobody anticipated, the service crashes loudly on one request, the circuit breaker opens, the on-call engineer gets paged, the bug is found in twenty minutes. The same code without pragmatic paranoia would have silently treated the unknown status as a success, double-charged customers for a week, and produced a six-figure refund campaign. The five disciplines don't prevent the surprise — they bound the damage.
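Part of this scenario can be sketched in code. The version below is a simplified illustration under several assumptions: the function is spelled `charge_card` in Python style, the card network is a hypothetical `network_call` callable that returns a dictionary with `status` and `amount` keys, amounts are integer cents, and the circuit breaker, daily-limit invariant, and database transaction are left out.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    SUCCEEDED = "succeeded"
    DECLINED = "declined"
    ERROR = "error"


class UnknownStatusError(Exception):
    """The card network returned a status this service does not understand."""


@dataclass(frozen=True)
class Transaction:
    amount: int  # cents
    status: Status


def charge_card(amount: int, card_token: str, network_call) -> Transaction:
    # Preconditions: a failure here is the caller's bug.
    if amount <= 0:
        raise ValueError("amount must be positive")
    if not card_token:
        raise ValueError("card_token must not be empty")

    response = network_call(amount, card_token)  # hypothetical: returns a dict

    try:
        status = Status(response["status"])
    except ValueError as exc:
        # Dead program: no fallback guess. Stop and let the caller react.
        raise UnknownStatusError(f"unknown status {response['status']!r}") from exc

    transaction = Transaction(amount=response["amount"], status=status)
    # Assertion: the amount charged must be the amount requested.
    assert transaction.amount == amount, "charged amount differs from requested amount"
    return transaction
```

Passing the network call in as a parameter is a design choice made for the sketch: it lets a test simulate an unanticipated status or a mutated amount without any real network, which is how the "new status code" day can be rehearsed before it happens.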
