Definition
A formal system is a precisely specified language together with a finite set of axioms and a finite set of rules of inference, all defined so that proofs can be carried out by manipulating symbols according to their shape alone.
The language fixes which strings of symbols count as well-formed statements. The axioms are the strings designated as starting points. The rules say which strings may be derived from which. Everything inside a formal system is settled by syntax — whether a string is well-formed, whether a step obeys a rule, whether a sequence of strings constitutes a proof — without ever interpreting what the symbols mean. Meaning, when it comes in at all, comes from outside, through a mapping onto some other domain that the system's structure happens to mirror.
Why it matters
How it works
A formal system is a game of shape, not meaning
Strip a formal system to its bones and what remains is a finite alphabet, a recipe for which strings count as well-formed, a set of axioms (designated starting strings), and a set of inference rules (each a pattern that licenses moving from some strings to another). To work inside a formal system you start from axioms, apply rules, and accumulate theorems — strings the rules can reach. The chain of strings that reaches a given theorem is its proof. Crucially, none of this requires understanding what the symbols are about; a child or a computer following the rules will derive the same theorems as a logician. Hofstadter calls this the closed game of symbol-shoving: a formal system is a sealed mechanism whose moves are defined by typography alone.
This shape-only character is what makes formal systems useful. It is also what makes them limited. A formal system has no way to certify that a particular string is not a theorem, because non-derivability is a fact about every possible derivation, and there are infinitely many. From inside, the system can only ever say "yes, here is a proof"; it cannot say "no proof exists."
Meaning enters through isomorphism
The pq-system illustrates how a meaningless game acquires meaning. Its strings use only p, q, and hyphens. Its axioms are every string of the form xp-qx-, and its single rule says: from xpyqz you may derive xpy-qz-. Symbol-shove for an hour and you generate strings like --p-q---. Then, almost unbidden, an interpretation suggests itself: read -- as the integer 2, p as plus, q as equals. Under that reading every theorem of the pq-system decodes to a true addition, and every true addition decodes from some theorem.
The mapping is not arbitrary. Reading p as minus would make --p-q--- decode to the false 2 - 1 = 3. Only mappings that respect the system's rules preserve theoremhood. Meaning is discovered, not stipulated — it is whatever structural mapping happens to make the rules of the system mirror the laws of some external domain. Hofstadter distinguishes passive meaning (a symbol stands for something by convention, like a red light meaning stop) from active meaning (a symbol's role in a system mirrors the role of something elsewhere, so the meaning is intrinsic to the system once the isomorphism exists). Formal systems traffic in active meaning. Mathematics works because the symbol-shuffling of arithmetic is isomorphic to facts about quantity; the rest of the program of formalization extends that bet to ever richer domains.
Working from inside vs reasoning from outside
The MIU-system is the smallest formal system in Gödel, Escher, Bach. Alphabet: M, I, U. Axiom: MI. Four rules let you append U after I, double the body after M, replace III by U, or delete UU. The puzzle: starting from MI, can you derive MU? Try it and you develop a strong suspicion that the answer is no, but no amount of playing the rules will prove that suspicion. The proof requires stepping outside.
The trick is to count Is modulo 3. The axiom MI has one I, so the I-count is 1 mod 3. Rule I adds a U (no change), Rule II doubles the body (so n mod 3 becomes 2n mod 3, taking 1 to 2 and 2 to 1), Rule III turns three Is into a U (subtracts 3, no change mod 3), Rule IV deletes UU (no change). So the I-count modulo 3 can only ever be 1 or 2 — never 0. But MU has zero Is. Therefore MU is unreachable. The argument never produces a single MIU-derivation; it runs entirely in arithmetic about the system. This is the most important distinction in the book: theorem (a string the rules can produce) is not the same as truth about the system (a fact established by reasoning from outside the rules). Inside vs outside is a permanent feature of working with formal systems.
Climbing the expressiveness ladder
Gödel, Escher, Bach presents formal systems as a ladder. MIU is rich enough only for its own puzzle. The pq-system captures addition. The tq-system (axioms of the form xt-q-x, one rule) captures multiplication and lets you read off the composite numbers as exactly those that appear on the right of some theorem. The propositional calculus (topic VII) adds connectives — negation, conjunction, disjunction, implication, biconditional — and inference rules including modus ponens and Hofstadter's "fantasy rule" (conditional proof: open a fantasy with an assumption, derive a consequence, close with an implication). Its theorems are exactly the tautologies, sentences true under every assignment of truth values to the atoms.
Each rung is a more expressive system than the last. The propositional calculus is the last rung that is both sound (every theorem is a tautology) and complete (every tautology is a theorem) — a striking match between syntax and semantics. The next rung, Typographical Number Theory (TNT), adds numerals (0, S0, SS0, …), variables, the operators + and ×, and the quantifiers ∀ and ∃. With these you can write genuine arithmetical statements like "every number has a successor" or "there are infinitely many primes." TNT is essentially Peano arithmetic in string-rewriting form, and it crosses the threshold where the soundness-and-completeness deal breaks.
Self-reference is unavoidable once a system can do arithmetic
Once a formal system is expressive enough to encode arithmetic, an alarming thing becomes possible: the system can talk about itself. The mechanism is Gödel numbering. Assign each TNT symbol a number, then encode a string as a product of prime powers — the first symbol's number is the exponent of 2, the second's the exponent of 3, the third's the exponent of 5, and so on. Every TNT formula becomes a unique number; every TNT proof, being a sequence of formulas, also becomes a number. Statements about TNT formulas (their well-formedness, their derivability) become statements about numbers — which are exactly what TNT itself is about.
Inside TNT you can now define a formula Proof-Pair(m, n) that says "m is the Gödel number of a TNT-proof of the formula with Gödel number n." This is a finite, primitive-recursive condition. From it you build Derivable(n) ≡ ∃m: Proof-Pair(m, n). The final step is a diagonal trick: construct a TNT formula G whose Gödel number g is arranged so that G says, in decoded form, "the formula with Gödel number g is not derivable in TNT" — that is, "I am not provable here." If TNT proves G, then TNT proves something false (since G says it isn't provable), so TNT is inconsistent. If TNT is consistent, then G is not provable — which is exactly what G asserts — so G is true but unprovable. Priest reaches the same destination by a slightly different route, noting that the construction depends on the halting theorem: if arithmetic were completely axiomatizable, you could decide halting, which is impossible.
Generating vs deciding — the asymmetry that lets Gödel in
The figure-and-ground topic of Gödel, Escher, Bach (illustrated with Escher's Mosaic II and Tiling of the Plane with Birds) installs the technical distinction that Gödel's theorem rides on. A set of numbers is recursively enumerable if some procedure, run long enough, eventually lists every member. A set is recursive (decidable) if some procedure, given any candidate, halts in finite time with the right yes/no answer. Every recursive set is recursively enumerable, but not every recursively enumerable set is recursive. Composite numbers are both (you can enumerate them via the tq-system and decide them via trial division). Primes are reachable only as the complement — defined by what they lack — and even though they too happen to be decidable, the topic shows that there are properties that are recursively enumerable yet undecidable. Once such a property exists, no formal system can both generate all instances and decide non-instances. Gödel's theorem is exactly this asymmetry made permanent at the level of arithmetic itself: provability is recursively enumerable, but truth is not.
Priest's compressed version — Hilbert's program and Gödel's verdict
Graham Priest gives the same result in a few pages by focusing on stakes rather than construction. Hilbert wanted to put all of mathematics on a perfectly secure footing: axiomatize each branch, then prove inside a stronger system that the axiomatization is consistent. Gödel killed that program twice. The first incompleteness theorem says that any consistent formal system rich enough to express arithmetic contains a true sentence it cannot prove — the construction is the self-referential sentence G saying "I am not provable here." The second incompleteness theorem adds that such a system cannot prove its own consistency. Priest notes that G is a close cousin of the liar paradox, but with a crucial difference: instead of paradox it produces a permanent boundary. Arithmetic outruns axiomatization, and the only way to extend the reach is to leap to a stronger system — which inherits its own Gödel sentence and its own unprovable truths.
The strange loop arrives
The result is that any sufficiently expressive formal system contains a strange loop: a hierarchy of levels (symbols, formulas, proofs, statements about proofs) that closes back on itself, so the same string of symbols means one thing as arithmetic and another thing about its own provability. Both interpretations apply simultaneously and validly. Hofstadter's Crab Canon dialogue stages this structure in prose — characters describing a Bach piece, an Escher print, and a DNA palindrome whose shared structure is the dialogue's own shape — so the reader sees self-reference happen without the word "self" ever appearing. The same indirection lets Gödel's sentence refer to its own provability by talking about numbers. Strange-loopiness is not a curiosity of TNT; it is the structural cost of being a formal system rich enough to matter.
History and the limits of the formal method
Logic's machinery accumulated in three bursts: Aristotle's syllogistic, the medieval scholastics, and the 19th–20th-century explosion that ran from Boole to Frege to Russell to Gödel and Turing. The formal-system idea sits at the end of that third burst. After 1931 it became clear that the formal method has both unprecedented power — propositional calculus is mechanizable, arithmetic is axiomatizable up to a point, computation itself can be formalized — and unavoidable limits. Soundness, completeness, decidability, consistency: each is a precise property that some formal systems have and others necessarily lack. The shift from informal reasoning to formal systems did not make logic safe; it made the boundaries of safety mathematically visible.
Two further results bolt the formal-system idea directly onto computer science. The Church–Turing thesis identifies "decidable by a formal procedure" with "computable by a Turing machine," so the question can this be decided inside a formal system? becomes the question can this be computed? — and Gödel's undecidability and Turing's halting problem are then two faces of one limit. The Curry–Howard correspondence runs the bridge the other way: it shows that proofs in a formal system and programs in a typed language are literally the same object viewed from two angles — a proposition is a type, a proof is a program inhabiting it, and proof normalization is program execution. This is why type systems and proof assistants are formal systems in the strict sense, and why their limits are the limits of formal systems generally.