Fundamentals
Core idea
Probability is the formal grammar of uncertainty. Before any calculation can begin, you have to nail three things down: the sample space (every distinct outcome the experiment can produce), the events (subsets of that space whose chance you care about), and the axioms a probability function must obey. Once those are fixed, the actual numerical answer depends on which interpretation you adopt — but every interpretation lives inside the same axiomatic frame.
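For reference, the axioms mentioned above are usually stated in Kolmogorov's form. The notation is introduced here for illustration and is not taken from the text: \Omega is the sample space, A and B are events, and P is the probability function. This is the finite version; the full statement extends the third line to countably many pairwise disjoint events.

```latex
\begin{align}
  P(A) &\ge 0 && \text{for every event } A \subseteq \Omega, \\
  P(\Omega) &= 1, \\
  P(A \cup B) &= P(A) + P(B) && \text{whenever } A \cap B = \emptyset .
\end{align}
```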
Author's argument: Probability does not defy common sense. The famous paradoxes only feel paradoxical because their hidden assumptions about the sample space were never made explicit. Flush those assumptions out and the answers fall into line.
The three living interpretations — classical (count symmetries), frequentist (measure long-run rates), and subjective Bayesian (state a degree of belief) — disagree about what a probability number means, but they agree on how probabilities must combine. That shared algebra is what makes probability theory a single discipline rather than three competing ones.
Why it matters
Almost every claim that begins "the probability is..." is doing concealed work. A drug company saying "85% effective" is doing frequentist work over a clinical trial. A weather forecaster saying "30% chance of rain" is making a subjective Bayesian assertion about this atmosphere on this day. A casino quoting house edge is doing classical counting on a symmetric sample space. Mistaking which interpretation is in play is the single most common way that probability claims get misused — and the most common way that intuition leads people astray.
The hidden-assumption problem
Most "trick questions" in probability (Monty Hall, the two-children paradox, the birthday problem) are not really about counting. They are about which sample space the question implicitly assumes. Once you write the sample space down explicitly, the so-called paradox dissolves. The discipline of naming your sample space before you reach for a fraction is the single most valuable habit this topic teaches.
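As a rough illustration of how writing the sample space down dissolves a paradox, the sketch below enumerates the two-children problem. It assumes the four birth-order outcomes are equally likely; the helper name `conditional` and the two readings of the question are illustrative choices, not part of the original text.

```python
from fractions import Fraction
from itertools import product

# Sample space: (older child, younger child), each "B" or "G".
# ASSUMPTION: all four outcomes are equally likely.
sample_space = list(product("BG", repeat=2))

def conditional(event, given):
    """P(event | given), computed by counting equally likely outcomes."""
    restricted = [o for o in sample_space if given(o)]
    favourable = [o for o in restricted if event(o)]
    return Fraction(len(favourable), len(restricted))

def both_boys(outcome):
    return outcome == ("B", "B")

# Reading 1: we are told "at least one child is a boy".
p_at_least_one = conditional(both_boys, lambda o: "B" in o)

# Reading 2: we are told "the older child is a boy".
p_older = conditional(both_boys, lambda o: o[0] == "B")

print(p_at_least_one)  # 1/3
print(p_older)         # 1/2
```

The two answers differ only because the two readings restrict the sample space differently, which is the point the paragraph above makes.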
The cost of skipping the foundations
Skipping straight to combinatorial formulas — "n choose k", "permutations of r from n" — produces students who can compute but cannot diagnose. They can grind out the answer to a textbook problem but cannot tell you why the lottery isn't a 50-50 proposition just because there are two outcomes (win, lose). The fundamentals topic exists to install the diagnostic vocabulary first.
Key takeaways
Mental model
The three interpretations are not rivals to be ranked but tools for different shapes of question. The diagram below maps each interpretation to the kind of evidence it draws on and the kind of question it answers best.
Practical application
Step 1 — Name the sample space explicitly
Before reaching for a formula, write down the full list of possible outcomes. For two coin flips, that is {HH, HT, TH, TT}: four outcomes, not three. The tempting alternative {0 heads, 1 head, 2 heads} looks sensible, but its three entries are not equally likely (their probabilities are 1/4, 1/2, and 1/4). The classical "count favourable over total" rule only works when every entry in your list is equally likely.
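A minimal sketch of this step, assuming a fair coin so that the four ordered outcomes are equally likely. The variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Enumerate the ordered outcomes of two flips.
outcomes = ["".join(flips) for flips in product("HT", repeat=2)]
print(outcomes)  # ['HH', 'HT', 'TH', 'TT']

# ASSUMPTION: fair coin, so each ordered outcome has probability 1/4.
p_each = Fraction(1, len(outcomes))

# Collapse to "number of heads" and see that the entries are NOT equally likely.
heads_count = Counter(o.count("H") for o in outcomes)
heads_dist = {k: n * p_each for k, n in sorted(heads_count.items())}

for k, p in heads_dist.items():
    print(k, p)
# 0 1/4
# 1 1/2
# 2 1/4
```

Counting "favourable over total" on the three-entry list would give 1/3 for each entry, which is the error the step warns against.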
Step 2 — Decide which interpretation is in play
Ask: is this a problem of symmetry, a problem of repeatable measurement, or a problem of one-off judgement? A fair die calls for classical. A clinical trial calls for frequentist. The chance a particular CEO will resign next quarter calls for subjective Bayesian — there is no symmetric sample space and the experiment cannot be repeated.
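To make the contrast between the first two cases concrete, the sketch below puts a classical and a frequentist answer side by side for a fair die. The simulated die, the seed, and the number of rolls are assumptions chosen for illustration; a real frequentist estimate would come from observed data rather than a random number generator.

```python
import random
from fractions import Fraction

# Classical: one favourable face out of six symmetric faces.
classical = Fraction(1, 6)

# Frequentist: relative frequency over many repeated trials.
# ASSUMPTION: a seeded pseudo-random generator stands in for a physical die.
rng = random.Random(0)
n_rolls = 100_000
sixes = sum(1 for _ in range(n_rolls) if rng.randint(1, 6) == 6)
frequentist = sixes / n_rolls

print(f"classical:   {float(classical):.4f}")
print(f"frequentist: {frequentist:.4f}")
```

The two numbers should land close together, but they are obtained by different routes: one by counting symmetries, the other by measuring a rate.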
Step 3 — Check that your number obeys the axioms
A probability assignment is malformed if any value is negative, any value is greater than one, or the values for disjoint events do not add up to the probability of their union. This sounds trivial but catches an enormous number of intuitive errors. People routinely claim "there is a 70% chance of A and a 60% chance of B" for mutually exclusive A and B without noticing that the two numbers sum to 1.3, which no probability can reach.
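The check described in this step can be sketched as a small validator. The function name `check_disjoint_assignment` and the dictionary format are hypothetical, and the function assumes the events it is given are mutually exclusive.

```python
import math

def check_disjoint_assignment(probs, tol=1e-9):
    """Return a list of axiom violations.

    `probs` maps a (hypothetical) event label to its claimed probability.
    ASSUMPTION: the events are mutually exclusive, so their sum cannot exceed 1.
    """
    problems = []
    for name, p in probs.items():
        if p < 0:
            problems.append(f"{name}: negative probability {p}")
        if p > 1:
            problems.append(f"{name}: probability {p} exceeds 1")
    total = math.fsum(probs.values())
    if total > 1 + tol:
        problems.append(f"disjoint events sum to {total:.2f}, which exceeds 1")
    return problems

# The 70% / 60% claim for mutually exclusive A and B.
print(check_disjoint_assignment({"A": 0.7, "B": 0.6}))
# ['disjoint events sum to 1.30, which exceeds 1']
```

An empty list means no violation was found, not that the numbers are right; the axioms only rule out incoherent assignments.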
Step 4 — Use counting tools correctly
For finite sample spaces, the combinatorial workhorses are permutations (order matters) and combinations (order does not). The 1,326 two-card hands in a 52-card deck come from C(52, 2) = 52 * 51 / 2. The 64 "blackjack" hands (an ace plus a ten-valued card) come from 4 aces * 16 ten-valued cards (the tens, jacks, queens, and kings). The probability of being dealt blackjack from a single fresh deck is then 64 / 1326, about 4.8%. Every classical-probability answer is a fraction whose numerator and denominator are both counting problems.
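The blackjack arithmetic can be checked in a few lines. The sketch assumes a single standard 52-card deck; the card encoding (rank character plus suit character) is an illustrative choice. It computes the count by formula and then confirms it by brute-force enumeration.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

# Formula route: C(52, 2) hands, 4 aces * 16 ten-valued cards.
two_card_hands = comb(52, 2)
blackjack_hands = 4 * 16
p_blackjack = Fraction(blackjack_hands, two_card_hands)

print(two_card_hands)               # 1326
print(p_blackjack)                  # 32/663
print(f"{float(p_blackjack):.4f}")  # 0.0483

# Brute-force route: enumerate every unordered two-card hand.
ranks = "A23456789TJQK"   # T stands for the ten
suits = "SHDC"
deck = [r + s for r in ranks for s in suits]
ten_valued = set("TJQK")

enumerated = sum(
    1
    for a, b in combinations(deck, 2)
    if "A" in (a[0], b[0]) and ten_valued & {a[0], b[0]}
)
print(enumerated)  # 64
```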
Example
Suppose a small startup is trying to decide whether to launch a new product. The CEO says, "I think there's an 80% chance this succeeds." The CFO says, "Historical data from our last twelve launches shows we succeed about 40% of the time." A junior analyst, simulating an idealized market, calculates "by symmetry there are roughly four equally likely market reactions, three of which we'd call success, so the probability is 75%."
All three numbers are well-formed probabilities. They obey the axioms. But they come from three completely different interpretations:
- The CEO is doing subjective Bayesian reasoning: an expressed degree of belief, presumably informed by private knowledge of the team, the market, and the product.
- The CFO is doing frequentist reasoning: an observed rate over twelve comparable trials, used as an estimate of the long-run rate. Note the implicit assumption that this launch is exchangeable with the previous twelve.
- The analyst is doing classical reasoning: counting outcomes on a symmetric model. The honest question is whether the four "market reactions" really are equally likely, or whether the analyst quietly fudged the symmetry.
The right response is not to pick a winner. It is to recognise that the three numbers answer three different questions and triangulate. If the CFO's frequentist rate is 40% but the CEO's belief is 80%, the gap is the value of the private information the CEO claims to have. If that information cannot be articulated, the credence should regress toward the base rate. Probability gives you the algebra to combine these views (Bayes' theorem, covered in later topics), but only if you first identify which interpretation each number came from.
Why the interpretations cannot be merged
A frequentist 40% is not "the same as" a Bayesian 80% that happens to differ in value. They make different types of claim. The 40% is a property of the long-run process; the 80% is a property of the speaker's epistemic state. You can update the 80% with the 40% as evidence, but you cannot average them — they are not on the same conceptual scale, even though they share a number line.
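As a preview of what "updating the 80% with the 40% as evidence" could look like, here is a minimal sketch using a Beta prior with binomial data. Everything specific in it is an assumption made for illustration: the Beta-binomial model itself, the reading of "about 40%" of twelve launches as 5 successes and 7 failures, and the prior weights, which stand for how many imagined launches the CEO's belief is worth. None of these come from the example above.

```python
from fractions import Fraction

def posterior_mean(prior_mean, prior_weight, successes, failures):
    """Mean of a Beta prior after a conjugate update with binomial data.

    The prior is Beta(prior_mean * prior_weight, (1 - prior_mean) * prior_weight).
    """
    alpha = prior_mean * prior_weight + successes
    beta = (1 - prior_mean) * prior_weight + failures
    return alpha / (alpha + beta)

ceo_belief = Fraction(4, 5)   # the CEO's stated 80%
successes, failures = 5, 7    # ASSUMPTION: "about 40%" of twelve launches

# ASSUMPTION: three hypothetical strengths for the CEO's private information.
for weight in (5, 10, 50):
    mean = posterior_mean(ceo_belief, weight, successes, failures)
    print(weight, f"{float(mean):.3f}")
# 5 0.529
# 10 0.591
# 50 0.726
```

The updated value depends on how much weight the belief is given relative to the data, which is information a plain average of 80% and 40% never asks for.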
Related concepts
- Probability
- Sample Space
- Probability Axioms
- Classical Probability
- Frequentist Probability
- Bayesian Probability