Concept

Misleading Statistics

Definition

Misleading statistics is a broad name for the recurring techniques, some accidental and many deliberate, that present a true number while creating a false impression. The data itself is rarely wrong; what is wrong is the framing, the comparison, or the visualisation. Truncated axes, cherry-picked baselines, selective summary statistics, and the confusion of correlation with causation are the most common offenders.

Statistical literacy strong enough to spot these techniques is now a basic civic skill. Anyone reading a chart in a newspaper or a slide in a meeting needs to ask not just what the numbers say but also how they were chosen, what they leave out, and which comparison they invite.

Why it matters

How it works

The classic visual trick is the truncated axis. A bar chart of last year's profit and this year's profit looks dramatically different depending on whether the y-axis starts at zero or at a value just below the smaller bar. Starting at a high value exaggerates small relative differences; starting at zero shows them in context. The same trick applies to line charts and to dual-axis comparisons where two series with unrelated units share a single panel. Statistically literate readers learn to check the axis before reading the shape.
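The effect of a truncated axis can be put into numbers. The sketch below is illustrative only: the function names and the profit figures (100 and 104) are assumptions made up for the example, not data from any real chart. It compares the ratio of the drawn bar heights with the ratio of the underlying values.

```python
def drawn_height_ratio(smaller, larger, axis_start=0.0):
    """Ratio of the two bar heights as drawn when the y-axis begins at axis_start."""
    if not axis_start < smaller <= larger:
        raise ValueError("need axis_start < smaller <= larger")
    return (larger - axis_start) / (smaller - axis_start)


def exaggeration_factor(smaller, larger, axis_start=0.0):
    """How many times larger the drawn change is than the real relative change.

    A value of 1.0 means the chart shows the difference at its true scale.
    """
    if not 0 < smaller < larger:
        raise ValueError("need 0 < smaller < larger")
    shown_change = drawn_height_ratio(smaller, larger, axis_start) - 1.0
    actual_change = larger / smaller - 1.0
    return shown_change / actual_change


# Hypothetical profits: 100 last year, 104 this year (a 4% rise).
last_year, this_year = 100, 104

for start in (0, 98):
    ratio = drawn_height_ratio(last_year, this_year, axis_start=start)
    factor = exaggeration_factor(last_year, this_year, axis_start=start)
    print(f"axis starts at {start}: bars drawn {ratio:.2f}:1, "
          f"change overstated about {factor:.0f}x")
```

With the axis starting at 98, the taller bar is drawn three times as high as the shorter one even though the values differ by only 4%. With the axis starting at zero, the drawn ratio matches the real one.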

The arithmetic tricks are subtler. Reporting only the mean of a skewed dataset hides how far a long tail has pulled it away from the typical value; reporting only the median hides the tail and the spread. Using a relative-risk figure ("incidence doubled") without the underlying absolute numbers ("from one case per million to two") makes tiny effects sound enormous. Choosing a baseline year that happens to be an unusual low or high (picking 2020 to anchor an unemployment trend, for example) turns ordinary recovery into a dramatic narrative. Selection effects compound the problem: a survey of the people who chose to respond is not a survey of the population. Each of these patterns can be defeated by asking three questions before believing a chart: what is being compared, how is the comparison framed, and what would the same data look like under a different but equally fair frame.
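Three of these arithmetic patterns are easy to reproduce with a few lines of Python. The sketch below uses only the standard library. Every number in it is invented for illustration, including the income list and the unemployment series, and the helper names are hypothetical.

```python
from statistics import mean, median


def summarize(values):
    """Report several summaries side by side instead of a single one."""
    return {
        "mean": mean(values),
        "median": median(values),
        "min": min(values),
        "max": max(values),
    }


def risk_change(baseline_cases, new_cases, population):
    """Return (relative risk, absolute risk increase per person)."""
    relative = new_cases / baseline_cases
    absolute = (new_cases - baseline_cases) / population
    return relative, absolute


def percent_change(series, baseline_year, end_year):
    """Percent change between two years of a {year: value} series."""
    start, end = series[baseline_year], series[end_year]
    return (end - start) / start * 100


# Made-up incomes (in thousands) with one very large value in the tail.
incomes = [30, 32, 35, 38, 40, 45, 500]
print(summarize(incomes))

# "Incidence doubled": one case per million becomes two per million.
relative, absolute = risk_change(1, 2, population=1_000_000)
print(f"relative risk {relative:.1f}x, absolute increase {absolute:.6%}")

# Made-up unemployment rates (%); 2020 is the unusual high.
unemployment = {2019: 4.0, 2020: 8.0, 2021: 5.0}
for base in (2020, 2019):
    change = percent_change(unemployment, base, 2021)
    print(f"{base} to 2021: {change:+.1f}%")
```

In this toy data the mean income is more than twice the median, the doubled risk amounts to one extra case per million people, and the same 2021 figure reads as a sharp fall or a rise depending only on the baseline year.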
