Probability

Definition

Probability is a numerical measure, on a scale from 0 to 1, of how likely an event is to occur — 0 for impossible, 1 for certain, and every value in between expressing a graded degree of expectation. It is the formal grammar of uncertainty: a way to give chance a number so that statements about the unknown can be combined, compared, and reasoned about with the same rigor as statements about the known.

The same number can be reached three different ways — by counting symmetries on a fair die, by measuring long-run frequencies from data, or by stating a personal degree of belief — and the three interpretations disagree about what the number means while agreeing on how numbers must combine. Probability is therefore both a mathematical object (a measure on a sample space, obeying the Kolmogorov axioms) and a philosophical one (a contested account of what it means for the future to be uncertain).

Why it matters

How it works

The axiomatic core

Every probability problem starts with a sample space, the complete catalog of distinct outcomes the situation can produce, and a set of events, which are subsets of that space. A probability function assigns a number between 0 and 1 to each event, subject to three rules from Kolmogorov: probabilities are non-negative, the entire sample space has probability 1, and the probability of a union of pairwise disjoint events is the sum of their probabilities. From those three rules, together with the definition of conditional probability, the standard formulas of the field follow: the complement rule (pr(not A) = 1 - pr(A)), the inclusion-exclusion rule for disjunctions (pr(A or B) = pr(A) + pr(B) - pr(A and B)), and the multiplication rule for conjunctions (pr(A and B) = pr(A | B) × pr(B)).
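As a minimal sketch of these rules, the snippet below builds a small finite sample space and checks the complement and inclusion-exclusion rules on it. The two-dice sample space, the equal weighting, and the names SAMPLE_SPACE, pr, A, and B are illustrative assumptions, not part of the text above.

```python
from fractions import Fraction
from itertools import product

# Assumed sample space: ordered outcomes of two fair six-sided dice.
SAMPLE_SPACE = frozenset(product(range(1, 7), repeat=2))


def pr(event):
    """Probability of an event (a subset of SAMPLE_SPACE) with equal weights."""
    return Fraction(len(event & SAMPLE_SPACE), len(SAMPLE_SPACE))


# Two illustrative events.
A = frozenset(o for o in SAMPLE_SPACE if sum(o) == 7)  # the total is 7
B = frozenset(o for o in SAMPLE_SPACE if o[0] == 6)    # the first die shows 6

# Complement rule: pr(not A) = 1 - pr(A)
complement_holds = pr(SAMPLE_SPACE - A) == 1 - pr(A)

# Inclusion-exclusion: pr(A or B) = pr(A) + pr(B) - pr(A and B)
inclusion_exclusion_holds = pr(A | B) == pr(A) + pr(B) - pr(A & B)

print(pr(A), pr(B), pr(A & B), pr(A | B))
print(complement_holds, inclusion_exclusion_holds)
```

Exact fractions are used rather than floats so that the equalities can be checked without rounding error.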

This minimal scaffolding is what unifies the discipline. Coin flips, radioactive decay, queues at a hospital, fluctuations in a stock price, and the reliability of a bridge component all live in different sample spaces, but they share the same algebra. As Haigh puts it in his Very Short Introduction, naming the sample space before reaching for a fraction is the single most valuable habit a student of probability can install — most "trick questions" in the field are not really questions about counting, they are questions about which sample space the wording implicitly assumes.

Three interpretations, one algebra

Probability is unusual among mathematical fields in that its central object has three competing philosophical interpretations, none of which has won the argument. Classical probability, the oldest, counts symmetric outcomes — favorable cases over total cases — and applies cleanly to dice, decks of cards, and lotteries where the sample space has obvious symmetries. Frequentist probability defines pr(A) as the limit of the relative frequency of A in a long sequence of repeatable trials; it is the interpretation behind clinical trial results, manufacturing defect rates, and insurance actuarial tables. Subjective Bayesian probability treats pr(A) as a personal degree of belief, constrained only by the axioms and by updating in light of evidence; it is the interpretation behind a weather forecaster's "30% chance of rain" and a court's assessment of a defendant's guilt.
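To make the contrast between the first two interpretations concrete, here is a small sketch that reaches the probability of rolling a six in two ways: by counting symmetric cases, and by measuring a relative frequency in simulated trials. The function names, the trial count, and the seed are assumptions chosen for illustration, and a pseudo-random simulation only stands in for real repeatable trials.

```python
import random
from fractions import Fraction


def classical_pr_six():
    """Classical view: one favorable face out of six symmetric faces."""
    return Fraction(1, 6)


def relative_frequency_of_six(trials, seed=0):
    """Frequentist view: share of simulated rolls that come up six."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    hits = sum(1 for _ in range(trials) if rng.randint(1, 6) == 6)
    return hits / trials


classical = float(classical_pr_six())
observed = relative_frequency_of_six(100_000)
print(f"classical: {classical:.4f}  observed frequency: {observed:.4f}")
```

The two numbers should land close together, but only the second depends on data, which is the sense in which the interpretations differ in meaning rather than in arithmetic.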

The three interpretations frequently produce the same numerical answer, but they disagree about what that number means. Suppose a startup CEO says "I think there's an 80% chance this product succeeds," a CFO replies "historical launch data puts us at 40%," and a junior analyst calculates "by symmetry there are four equally likely market reactions and three count as success, so 75%." These are answers to three different questions, not three competing answers to one question. The right response is not to pick a winner but to triangulate: the gap between the CEO's subjective credence and the CFO's frequentist base rate is the value of the private information the CEO is claiming to possess, and if that information cannot be articulated, the credence should regress toward the base rate.

Conditional probability and inductive validity

The decisive notion for reasoning is conditional probability, written pr(A | B) — the probability of A given that B holds. It is computed by restricting attention to the cases where B is true, then asking what fraction of those also have A: pr(A | B) = pr(A and B) / pr(B), undefined when pr(B) is zero. Conditional probability is what lets evidence move belief — it is the formal machinery behind every diagnostic test, every legal inference, and every Bayesian update.
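The definition can be sketched directly on a finite sample space. The two-dice setup, the equal weighting, and the names OUTCOMES, pr, pr_given, total_is_8, and first_is_even are assumptions made for this example.

```python
from fractions import Fraction
from itertools import product

# Assumed sample space: ordered outcomes of two fair six-sided dice.
OUTCOMES = frozenset(product(range(1, 7), repeat=2))


def pr(event):
    """Probability of an event under equally likely outcomes."""
    return Fraction(len(event), len(OUTCOMES))


def pr_given(a, b):
    """pr(A | B) = pr(A and B) / pr(B); undefined when pr(B) is zero."""
    if pr(b) == 0:
        raise ValueError("pr(A | B) is undefined when pr(B) is zero")
    return pr(a & b) / pr(b)


total_is_8 = frozenset(o for o in OUTCOMES if sum(o) == 8)
first_is_even = frozenset(o for o in OUTCOMES if o[0] % 2 == 0)

prior = pr(total_is_8)                          # before any evidence
posterior = pr_given(total_is_8, first_is_even)  # after learning B
print(prior, posterior)
```

Here learning that the first die is even raises the probability of a total of 8, a small instance of evidence moving belief.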

In Graham Priest's Logic: A Very Short Introduction, conditional probability is also what rescues inductive reasoning from its weakness relative to deduction. A deductive inference guarantees its conclusion if the premises hold; an inductive inference cannot. But Priest gives induction a precise standard: an inference is inductively valid just when the conditional probability of the conclusion given the premises is greater than the conditional probability of its negation given the same premises. Sherlock Holmes's famous "deductions" are really inductions in this sense — a worn cuff makes "writes a lot for a living" more probable than not, conditional on the cuff-wear, even though it does not make it certain. Probability is the tool that turns Holmes's pattern-matching into something with a rule attached.
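The criterion as stated above can be written as a short check. This is only a sketch: the function name inductively_valid is hypothetical, and the cuff-wear figures are invented for illustration, not taken from Priest.

```python
from fractions import Fraction


def inductively_valid(pr_conclusion_and_premises, pr_premises):
    """Return True when pr(C | P) > pr(not C | P), as the text states it."""
    if pr_premises == 0:
        raise ValueError("conditional probability undefined: pr(premises) is zero")
    pr_c_given_p = Fraction(pr_conclusion_and_premises) / Fraction(pr_premises)
    # pr(not C | P) = 1 - pr(C | P), by the complement rule.
    return pr_c_given_p > 1 - pr_c_given_p


# Invented numbers: in a population of 1000, suppose 100 people have worn
# cuffs and 70 of those 100 write a lot for a living.
pr_worn_cuff = Fraction(100, 1000)
pr_writer_and_worn_cuff = Fraction(70, 1000)

holmes_inference_ok = inductively_valid(pr_writer_and_worn_cuff, pr_worn_cuff)
print(holmes_inference_ok)
```

Because the two conditional probabilities sum to 1, the test is equivalent to asking whether pr(C | P) exceeds one half.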

The reference-class problem

Every probability statement is implicitly relative to a class of cases, and the choice of class can change the number dramatically. A screening test reported as "90% accurate" is 90% accurate within some class (everyone screened, the symptomatic, a particular age band), and each class yields a different probability that a positive result means real disease. Priest's discussion of probability pushes this observation to its limit. The narrowest, most specific reference class for any individual is the class containing only that individual. But then either the person has the condition or they do not, so the probability collapses to 1 or 0, and the inference can no longer be used to discover whether the condition is present. Pushed to its logical extreme, the reference-class problem threatens to make inductive validity collapse into uselessness.

Practical probability lives with this tension by accepting that the reference class is a modeling choice, not a fact. The discipline is to state the class out loud, defend the choice against alternatives, and treat probability statements as conditional on that choice. A probability without a named reference class is a number without units.
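The dependence on the reference class can be illustrated with Bayes' rule. The sketch below assumes that "90% accurate" means both 90% sensitivity and 90% specificity, which is one reading among several, and the prevalence figures for each class are invented for illustration.

```python
def pr_disease_given_positive(prevalence, sensitivity=0.9, specificity=0.9):
    """Probability of disease given a positive test, by Bayes' rule.

    prevalence is the base rate of disease within the chosen reference class.
    """
    true_positives = prevalence * sensitivity
    false_positives = (1 - prevalence) * (1 - specificity)
    return true_positives / (true_positives + false_positives)


# Hypothetical reference classes with made-up base rates.
reference_classes = {
    "everyone screened": 0.01,
    "symptomatic patients": 0.30,
}

for name, prevalence in reference_classes.items():
    ppv = pr_disease_given_positive(prevalence)
    print(f"{name}: pr(disease | positive) = {ppv:.3f}")
```

The same test yields very different answers depending on which class supplies the base rate. Setting the prevalence to exactly 0 or 1, as in the one-person reference class, makes the function return 0 or 1 regardless of the test result.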

Combinatorics and the structure of finite sample spaces

For finite sample spaces, classical probability reduces to counting. The two combinatorial workhorses are permutations (where order matters) and combinations (where it does not). The 1,326 distinct two-card hands in a 52-card deck come from C(52, 2) = 52 × 51 / 2. The 64 "blackjack" hands come from 4 aces × 16 ten-cards. The probability of being dealt a blackjack is therefore 64 / 1326, just under 5%. Every classical-probability answer is a fraction whose numerator and denominator are both counting problems.

This is why combinatorics and probability are usually taught together: once the sample space is specified, the calculation reduces to enumeration. The bookkeeping can be elaborate (multi-stage experiments, conditional sub-spaces, sampling with and without replacement), but the underlying move is always the same: count the favorable outcomes, count the total, divide.
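The blackjack count above can be reproduced with the standard library. The variable names are illustrative; math.comb and math.perm require Python 3.8 or later.

```python
from fractions import Fraction
from math import comb, perm

# Combinations: unordered two-card hands from a 52-card deck.
two_card_hands = comb(52, 2)

# Permutations: the same two cards counted in both deal orders.
ordered_two_card_deals = perm(52, 2)

# Favorable hands: one of 4 aces paired with one of 16 ten-valued cards.
blackjack_hands = 4 * 16

pr_blackjack = Fraction(blackjack_hands, two_card_hands)
print(two_card_hands, ordered_two_card_deals, blackjack_hands)
print(pr_blackjack, float(pr_blackjack))
```

Counting ordered deals instead would double both numerator and denominator, leaving the probability unchanged, provided the same convention is used for both counts.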

Probability as a quantitative lens for general reasoning

In The Great Mental Models, Volume 3, probability appears not as a calculation tool but as one of nine quantitative shapes the mind should learn to recognize. Alongside compounding, regression to the mean, the destructive power of multiplying by zero, and the topography of distributions, probability is treated as an intuition to be installed — a default lens for situations where the numbers are messy, missing, or unreliable.

The argument is that most reasoning errors are quantitative errors wearing other disguises. When you confuse an anecdote for evidence, you are misusing the implicit sample. When you mistake a hot streak for permanent improvement, you are forgetting regression to the mean. When you assume that ten years of preparation will pay back linearly, you are missing compounding. People who appear to be unusually good with numbers are rarely faster at arithmetic — they are faster at recognizing which shape a situation has, and therefore which quantitative intuition applies. Probability, in this framing, is less about computing the chance of an event than about noticing when a problem is governed by chance in the first place.

Distributions, expectations, and the law of large numbers

Once probabilities are attached to outcomes, random variables turn outcomes into numbers and distributions describe how those numbers spread. The headline distributions (binomial for counts of successes, Poisson for counts of rare events in time, normal for sums of many small independent contributions) are the workhorses behind nearly every applied use of probability, from quality control to clinical trials to financial risk models. The expected value of a random variable is the probability-weighted average of its possible values, which is also the average it settles toward over many trials; the variance is the expected squared deviation from that average, a measure of how spread out the values are.
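For a discrete distribution, expected value and variance are short sums. The sketch below represents a distribution as a dictionary from value to probability, which is an assumption of this example, and the helper names are hypothetical.

```python
from fractions import Fraction
from math import comb


def expected_value(dist):
    """Probability-weighted average of the values in dist."""
    return sum(x * p for x, p in dist.items())


def variance(dist):
    """Expected squared deviation from the expected value."""
    mu = expected_value(dist)
    return sum(p * (x - mu) ** 2 for x, p in dist.items())


def binomial_pmf(n, p):
    """Distribution of the number of successes in n independent trials."""
    return {k: comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)}


die = {face: Fraction(1, 6) for face in range(1, 7)}
ten_coin_flips = binomial_pmf(10, Fraction(1, 2))

print(expected_value(die), variance(die))
print(expected_value(ten_coin_flips), variance(ten_coin_flips))
```

For the binomial case the results agree with the closed forms n·p for the mean and n·p·(1 - p) for the variance.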

The law of large numbers says that as the number of independent trials grows, the observed average converges to the expected value, which is why insurance works, why casinos profit from house edges over time, and why one experiment is rarely enough to settle a scientific question. The central limit theorem says that the sum of many small independent random contributions, each with finite variance, is approximately normally distributed, which is why the bell curve appears everywhere from measurement errors to heights of trees to test scores. Together these two results form the bridge from probability (reasoning forward from models to data) to statistics (reasoning backward from data to models).
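Both results can be seen in a small simulation. This is a sketch using simulated die rolls; the function names, sample sizes, and seeds are arbitrary assumptions, and a simulation illustrates the theorems rather than proving them.

```python
import random
from math import sqrt


def average_of_die_rolls(n, seed=0):
    """Law of large numbers: the average of n rolls should approach 3.5."""
    rng = random.Random(seed)
    return sum(rng.randint(1, 6) for _ in range(n)) / n


def share_of_sums_within_one_sd(rolls_per_sum=30, sums=5_000, seed=1):
    """Central limit theorem: share of sums within one standard deviation
    of their mean (roughly 0.68 for a normal distribution)."""
    rng = random.Random(seed)
    mean = 3.5 * rolls_per_sum              # a single roll has mean 7/2
    sd = sqrt(35 / 12 * rolls_per_sum)      # a single roll has variance 35/12
    within = 0
    for _ in range(sums):
        total = sum(rng.randint(1, 6) for _ in range(rolls_per_sum))
        if abs(total - mean) <= sd:
            within += 1
    return within / sums


for n in (10, 1_000, 100_000):
    print(n, average_of_die_rolls(n))
print(share_of_sums_within_one_sd())
```

The averages should drift toward 3.5 as n grows, and the share of sums within one standard deviation should sit near the normal benchmark, with some deviation because the sums are discrete.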

Where it goes next

Bayes Theorem
Conditional Probability
Expected Value
Law of Large Numbers
Probability Axioms
Random Variable
Reference Class Problem
Sample Space
Inductive Reasoning
