Mathematics

Core idea

Math as intuition, not arithmetic

Shakuntala Devi called mathematics "a systematic effort of solving puzzles posed by nature." This topic does not ask you to compute anything. It asks you to absorb nine quantitative shapes — distributions, compounding curves, sampling logic, the difference between random and patterned, the gravity of the mean, the destructive power of zero, the equivalence of dissimilar things, the trade-offs of surface area, and the topography of maxima — and to use those shapes as default lenses when the underlying numbers are messy, missing, or untrustworthy.

Why the math topic sits next to the systems topic

The systems topic taught you to see the structure of how parts interact. The mathematics topic teaches you to reason about the quantities flowing through that structure. Together, they form a complete kit: structure plus quantity. A startup is a system of feedback loops; whether it survives depends on whether its growth rate compounds faster than its churn rate. A scientific claim is a system of evidence; whether you should believe it depends on the sample size and the distribution of possible findings. The two topics are halves of the same toolkit.

Why it matters

Most reasoning errors are quantitative errors in disguise

When you confuse an anecdote for evidence, you are misusing sampling. When you mistake a hot streak for a permanent improvement, you are forgetting regression to the mean. When you assume that ten years of preparation will pay back linearly, you are missing compounding. When you accept a deal in which one critical term is zero, you are forgetting that anything multiplied by zero is zero. The errors look different on the surface, but underneath they share a structure: a missing or mis-applied quantitative intuition.

People seem good with numbers only because they think this way

People who appear to be unusually good with numbers are rarely faster at arithmetic. They are faster at recognizing which of these nine shapes a situation has — and therefore which intuitions apply and which would mislead. That recognition is the skill the topic is trying to install.

Key takeaways

The models in this domain

Distributions

A distribution is the shape made when you plot how often each value of a quantity occurs in a population. The normal distribution — the bell curve — is symmetric, peaked at the mean, with values rarer the farther they fall from the centre. Human height, blood pressure, IQ scores, measurement errors, and the price of a familiar commodity all roughly follow it. Once you know the mean and the spread, you know almost everything you need to know.

The power law distribution is shaped completely differently: most observations cluster near one end, with a long tail of values that are vastly larger. Income, city populations, book sales, the size of wildfires, and the death toll from wars all follow power laws. In a power-law world, the mean is misleading — a few extreme values dominate the total, and "normal" hides the tail risk that actually matters. Christian and Griffiths' rule of thumb is striking: in a normal distribution, something that has gone on a long time is overdue to end; in a power law distribution, the longer something has gone on, the longer you should expect it to continue.

The practical move is to identify which kind of distribution you are in before applying any intuition. Insurance, finance, public-health policy, and engineering all routinely fail when people treat power-law domains with normal-distribution thinking — and the failures show up in the tails, where the worst things happen.
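The contrast between the two shapes can be sketched with a small simulation. This is only an illustration under assumed parameters: the "height-like" normal quantity (mean 170, spread 10) and the "wealth-like" Pareto quantity (tail exponent 1.2) are hypothetical choices, not figures from the text.

```python
import random
import statistics

def top_share(values, fraction=0.01):
    """Share of the total held by the largest `fraction` of values."""
    ordered = sorted(values, reverse=True)
    k = max(1, int(len(ordered) * fraction))
    return sum(ordered[:k]) / sum(ordered)

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 100_000

# Hypothetical "height-like" quantity: normal, mean 170, spread 10.
normal_values = [rng.gauss(170, 10) for _ in range(n)]

# Hypothetical "wealth-like" quantity: Pareto with tail exponent 1.2.
pareto_values = [rng.paretovariate(1.2) for _ in range(n)]

for name, values in [("normal", normal_values), ("power law", pareto_values)]:
    print(
        f"{name:>9}: mean={statistics.mean(values):.1f} "
        f"median={statistics.median(values):.1f} "
        f"top 1% share={100 * top_share(values):.1f}%"
    )
```

In the normal sample the mean and median nearly coincide and the top 1% hold only a little more than 1% of the total. In the power-law sample the mean sits well above the median and a large share of the total belongs to the top 1%, which is why the average is a poor guide there.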

Compounding

Compounding is exponential growth: each period's gain becomes part of next period's base, so the curve gets steeper the longer it runs. Money in a savings account compounds; so do knowledge, relationships, reputation, fitness, and skill. Naval Ravikant puts it bluntly: "All the returns in life, whether in wealth, relationships, or knowledge, come from compound interest."

The mind handles compounding badly because the early part of the curve is flat enough to look like stagnation. A dollar at 10% becomes $1.10, then $1.21, then $1.33 — disappointingly slow. But the same dollar after 30 years is $17.45, and after 50 years $117.39. Most of the action is at the end. People who quit halfway through never see the part that justified the effort.
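The figures above can be reproduced in a few lines. This is a minimal sketch assuming annual compounding at a constant 10% with nothing added or withdrawn; the function name `compound` is illustrative.

```python
def compound(principal, rate, years):
    """Value of `principal` after `years` periods of growth at `rate`."""
    return principal * (1 + rate) ** years

for years in (1, 2, 3, 10, 30, 50):
    print(f"year {years:>2}: ${compound(1.0, 0.10, years):.2f}")

# How much of the 50-year total arrives in the final 20 years?
v30 = compound(1.0, 0.10, 30)
v50 = compound(1.0, 0.10, 50)
late_share = (v50 - v30) / v50
print(f"share of final value gained after year 30: {100 * late_share:.0f}%")
```

Under these assumptions roughly 85% of the 50-year value is gained in the last 20 years, which is the sense in which most of the action is at the end.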

The model has a dark mirror. Bad habits, debt, and unresolved conflict also compound. A small grievance unaddressed each week is a different relationship six months from now. The same exponential shape that builds fortunes destroys them. The practical instruction is: identify what is compounding in your life, accelerate the good loops, and intervene fast in the bad ones — because the longer either runs, the more dramatic the consequences.

Sampling

Sampling is how you learn about a population by examining a subset. Done well, a sample of a few hundred can describe a population of millions to within a few percentage points. Done poorly — small, biased, self-selecting — it produces conclusions that feel confident and mislead spectacularly. Most "evidence" people argue from is sampling-poor.
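The claim that a few hundred observations can describe millions rests on a standard approximation for the margin of error of a proportion. The sketch below assumes a simple random sample, a large population, and a 95% confidence level (z = 1.96); it says nothing about biased or self-selected samples, which no sample size repairs.

```python
import math

def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion from a simple random sample."""
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)

for n in (10, 100, 400, 1000, 10_000):
    print(f"n={n:>6}: +/- {100 * margin_of_error(n):.1f} percentage points")
```

Population size does not appear in the formula: under these assumptions, precision depends on how many people you sample, not on how many exist.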

Morgan Housel's line captures the deepest sampling trap: "Your personal experiences with money make up maybe 0.00000001% of what's happened in the world, but maybe 80% of how you think the world works." The same is true for politics, parenting, careers, food, and travel. Your sample is whatever you happened to live through, in whatever years you happened to be born, in whatever country your parents happened to live in. It is microscopic, biased, and yet feels like the truth.

To sample better: deliberately expose yourself to data from outside your own slice — books, statistics, conversations with people unlike you. Be suspicious when a conclusion is drawn from "I know a guy who…" When you cannot enlarge the sample, at least name the sampling limit ("this is one data point") so that you weight it accordingly.

Randomness

Most of what happens around us is partly random — outcomes shaped by causes we cannot see or by chance itself. The mind is built to find patterns; it sees faces in clouds and motives in coincidences. Leonard Mlodinow argues that this pattern-hunger systematically misleads us when the underlying process is actually random.

The two costs of underestimating randomness are mirror images. We attribute skill where there was luck: the hedge-fund manager whose three-year streak fell inside the range of pure chance across thousands of managers, the founder for whom market timing did all the work. And we attribute meaning where there was noise: the unlucky week we read as a sign, the random rejection we make about ourselves.

The practical move is to ask, before drawing any conclusion from a small number of events: if this were just luck, would it look any different? If the answer is no, your conclusion is at most a hypothesis to test with more data, not a fact. Process beats outcome for exactly this reason — a good process can produce a bad outcome (and vice versa) on any given roll of the dice, but it wins out over enough trials.
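The "would luck look any different?" question can be made concrete for the streak case. The sketch assumes a deliberately crude model: each manager has an independent 50% chance of beating the market in any year, and the pool of 1,000 managers is a hypothetical number chosen for illustration.

```python
def expected_lucky_streaks(managers, years, p_beat=0.5):
    """Expected number of managers who beat the market every year by chance alone."""
    return managers * p_beat ** years

print(expected_lucky_streaks(1000, 3))   # 125.0
print(expected_lucky_streaks(1000, 10))  # about 0.98
```

Under this model a three-year streak is what you would expect from about one manager in eight with no skill at all, so on its own it is weak evidence. A ten-year streak is much harder to produce by luck in a pool of this size.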

Regression to the mean

When a quantity is partly skill and partly luck, extreme outcomes — the best month, the worst quarter, the standout rookie season — are usually followed by less extreme ones. The luck component cannot stay maximally favorable; the average pulls outcomes back toward itself. This is regression to the mean, and it is one of the most counter-intuitive forces in everything from sports to medicine to management.

The classic illustration: a pilot's bad landing gets reprimanded; the next landing is better. A pilot's great landing gets praised; the next landing is worse. The instructor concludes that punishment works and praise backfires. In fact, both follow-ups regressed toward the mean: what looked like the effect of feedback was the pull of the average drawing extremes back. Kahneman gives this example in Thinking, Fast and Slow as one of his most durable insights.
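The landing example can be simulated with no feedback at all. This is a sketch under an assumed model, not Kahneman's data: each landing score is a stable skill term plus an independent luck term of equal spread, and nobody is praised or reprimanded between the two landings.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed for repeatability

# Hypothetical model: landing score = stable skill + one-off luck.
pilots = [rng.gauss(0, 1) for _ in range(10_000)]        # skill
first = [skill + rng.gauss(0, 1) for skill in pilots]    # landing 1
second = [skill + rng.gauss(0, 1) for skill in pilots]   # landing 2, no feedback given

# Rank pilots by their first landing and take the extremes.
ranked = sorted(range(len(pilots)), key=lambda i: first[i])
worst, best = ranked[:1000], ranked[-1000:]

def mean_of(scores, group):
    """Average score of the pilots whose indices are in `group`."""
    return statistics.mean(scores[i] for i in group)

print(f"worst 10%: first={mean_of(first, worst):.2f}, second={mean_of(second, worst):.2f}")
print(f"best 10%:  first={mean_of(first, best):.2f}, second={mean_of(second, best):.2f}")
```

The worst group improves and the best group slips on the second landing even though no one said a word to them. The best group still lands above average, because the skill part of the score persists while the luck part does not.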

The model rescues you from two errors: chasing the hot streak as if it will continue, and over-correcting after the cold one as if it will define you. Reward and punish behaviors, not outcomes. The latter will regress on their own; the former is what you can actually control.

Multiplying by zero

Any number multiplied by zero is zero. Two billion times zero is zero. A merger valued at $10 billion, but conducted under a fraudulent accounting figure, is worth zero. A startup with brilliant tech, a stellar team, and a market that does not want the product is worth zero. A relationship with deep love but a broken core trust is worth zero. The mental model directs you to find the zero — the load-bearing piece that, if missing, makes all the other excellence irrelevant.

In multiplicative systems, each input bounds the whole. There is no compensating for a zero by maxing out the other terms. A great brand with a broken product line cannot brand its way out. A great strategy with no execution capability cannot strategy its way out. The first question to ask of any opportunity is not "what is good about this?" but "is there a zero in it?"

The corollary is constructive: when you are trying to improve a system, look for the smallest factor first. Doubling any single factor doubles the product, but doubling a 0.1 takes only another 0.1, while doubling an 8 takes another 8. Put the same 0.1 of improvement into the 8 instead and the product rises by barely 1%. The leverage is at the bottleneck term, not the top one.
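A multiplicative system is easy to sketch. The factor names and scores below are hypothetical, chosen only to show how a zero wipes out the product and how the same absolute improvement pays off very differently depending on which factor receives it.

```python
import math

def outcome(factors):
    """Multiplicative model: the result is the product of all factors."""
    return math.prod(factors.values())

# Hypothetical scores for a venture, each on its own scale.
venture = {"technology": 9.0, "team": 8.0, "market_demand": 0.0}
print(outcome(venture))  # 0.0, no matter how strong the other terms are

venture["market_demand"] = 0.1
base = outcome(venture)

# Apply the same absolute improvement (+0.1) to the weakest and the strongest factor.
fix_weakest = outcome({**venture, "market_demand": 0.2})
polish_strongest = outcome({**venture, "technology": 9.1})
print(f"fix weakest:      x{fix_weakest / base:.2f}")
print(f"polish strongest: x{polish_strongest / base:.2f}")
```

Raising the weakest factor by 0.1 doubles the result, while the same 0.1 added to the strongest factor moves it by about 1%.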

Equivalence

Different things can produce equal results. Two routes to the same destination, two recipes that yield the same dish, two cultures that solve the same problem with completely different rituals. Equivalence as a model says: when the surface differs but the structure does the same job, treat them as instances of the same thing.

In practice, the move has two directions. Look for hidden equivalence to apply solutions across domains: the math that describes heat flow also describes information flow; the strategies that work for retail customer retention also work for fundraising-donor retention; the techniques negotiators use to resolve commercial disputes also work in your marriage. Look for surface-level differences that hide functional equivalence to avoid duplicated work: two teams building "different" tools that turn out to do the same thing; two parts of your life pursuing the same underlying goal through unrelated activity.

The deepest use of equivalence is in physics and math, where vastly different phenomena turn out to be governed by the same equations. The same intellectual move applies in everyday life: when you can see that two things are equivalent in structure, you double your library of strategies for handling each.

Surface area

Surface area is the amount of a system in contact with the outside world. A teaspoon of granulated sugar dissolves faster than a sugar cube because the loose grains expose more surface. Lungs and intestines have evolved elaborate folds to maximize contact area for gas and nutrient exchange. Arctic mammals have evolved compact bodies to minimize the surface area through which they lose heat.
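The sugar example follows from simple geometry. The sketch assumes an idealized cube of side 1 cut into equal smaller cubes; real sugar grains are not perfect cubes, so treat the numbers as an illustration of the scaling only.

```python
def total_surface_area(edge, pieces_per_side):
    """Total surface of a cube of side `edge` cut into pieces_per_side**3 equal cubes."""
    small_edge = edge / pieces_per_side
    count = pieces_per_side ** 3
    return count * 6 * small_edge ** 2

for k in (1, 2, 10, 100):
    print(f"{k ** 3:>9} pieces: total area = {total_surface_area(1.0, k):.1f}")
```

The volume never changes, but the total surface grows in proportion to the number of cuts per side: cutting each edge into 10 gives ten times the exposed area of the intact cube.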

The same trade-off governs your life. More surface area means more reactions: more meetings means more ideas, more openness means more learning, more publishing means more feedback, more travel means more luck. Less surface area means more protection: fewer commitments means more focus, fewer audiences means less criticism, fewer attack vectors means less risk.

The choice is not "more is better" or "less is better" — it is to tune surface area to what you are trying to achieve right now. Growing? Open up. Recovering? Pull back. The error is staying at the same surface area across phases of life that demand different ones.

Global and local maxima

The maximum of a function in a given range is its highest value there. A local maximum is the highest point in a small neighborhood — the top of one hill. A global maximum is the highest point overall — Mount Everest among all hills. The model is geometric, but it describes every situation in which you have to choose between the best nearby option and the best possible option.

Most life choices begin at a local maximum. Your current job is decent. Your current city is fine. Your current way of doing things works. Climbing higher from where you stand makes things modestly better. To reach a much higher peak somewhere else, you have to temporarily go down — quit the job, move, throw out the working code and start over. The mathematics of search confirms what people knew anyway: "we may need to temporarily worsen our solution if we want to continue searching for improvements."
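Getting stuck on a nearby hill can be sketched with a greedy search over a one-dimensional landscape. The heights below are hypothetical, and restarting from every position is just one simple stand-in for accepting a descent; it is not the only search strategy that escapes local maxima.

```python
def hill_climb(heights, start):
    """Greedy search: move to the higher neighbour until no neighbour is higher."""
    i = start
    while True:
        neighbours = [j for j in (i - 1, i + 1) if 0 <= j < len(heights)]
        best = max(neighbours, key=lambda j: heights[j])
        if heights[best] <= heights[i]:
            return i  # no neighbour is higher: this is a (possibly local) peak
        i = best

# Hypothetical landscape: a small hill (peak 5) and a taller one (peak 9),
# separated by a valley.
landscape = [1, 3, 5, 4, 2, 1, 3, 6, 9, 7]

stuck_at = hill_climb(landscape, start=0)
print(stuck_at, landscape[stuck_at])  # stops on the small hill

# Trying every starting point means some searches begin below the first peak.
best_peak = max(
    (hill_climb(landscape, s) for s in range(len(landscape))),
    key=lambda i: landscape[i],
)
print(best_peak, landscape[best_peak])
```

The greedy climber that starts on the left never leaves the small hill, because every step toward the taller one first goes downhill. Only a search willing to start lower finds the higher peak.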

The practical lesson is twofold. First, recognize when you are at a local maximum and assume there is something higher elsewhere worth the descent. Second, when refining within a peak, get the macro alignment right before polishing the details — there is no point optimizing a sandcastle when the tide is coming in.

Mental model

Practical application

When you face a decision under uncertainty, run it through this checklist before committing:

Example

Why most career advice misleads you

A successful founder writes a book about the moves that built their company. The advice sounds concrete and you adopt it. Months later, nothing has changed for you.

Run the quantitative checklist. Sampling: the book is one data point, written by the founder who succeeded and decided to publish. The thousands of founders who did much the same things and failed wrote no books, so they are invisible in your data. Randomness: a non-trivial fraction of what worked for the author was the year they happened to start, the macroeconomic tailwind, the partner who happened to be available. None of that transfers. Regression to the mean: the author's first company was an extreme positive outlier; their next venture, statistically, is likely to be more average, which is one reason follow-ups from celebrated founders often disappoint. Power-law distribution: outcomes in startups follow a power law, so the median experience is failure even when the headline cases are huge wins. Multiplying by zero: copying the founder's tactics while missing one critical factor (their network, their timing, their hidden funding source) multiplies all the other work by zero.

None of this means the book is worthless. It means the right way to read it is as one data point with strong selection bias, in a power-law domain dominated by randomness. Take the structural insights, discard the specific tactics, and look for principles that show up across many such books — equivalence again, applied to your reading. That is how the quantitative lenses turn ordinary advice into useful signal.
