Definition
Simpson's paradox is the statistical phenomenon in which a trend present in several sub-groups of a population reverses or disappears when the sub-groups are aggregated. A treatment can appear better than a control within every patient sub-group, yet appear worse when the data is pooled — and vice versa.
Named after the statistician Edward Simpson, who described it in 1951, the paradox can arise when sub-group membership is associated with both the treatment received and the outcome, so that the treatment arms are pooled over different sub-group mixes; that is, when a lurking variable sits behind the aggregation.
Why it matters
How it works
Consider two treatments, A and B, applied across two patient sub-groups. Suppose treatment A cures 80% of sub-group 1 and 30% of sub-group 2, while treatment B cures 90% and 40%. B wins in both sub-groups. But if A is mostly given to sub-group 1 (easy cases) and B mostly to sub-group 2 (hard cases), the pooled cure rates can show A winning overall, because each pooled rate is a weighted average of the sub-group rates: A's sits close to its 80% and B's close to its 40%.
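To make the reversal concrete, here is a minimal Python sketch. The patient counts are hypothetical: they are chosen only so that the sub-group cure rates match the percentages above and so that A goes mostly to sub-group 1 and B mostly to sub-group 2. The names counts, cure_rate and pooled_rate are illustrative.

```python
# Hypothetical counts, chosen to reproduce the cure rates in the text.
# Key: (treatment, sub_group) -> (patients cured, patients treated)
counts = {
    ("A", 1): (72, 90),  # 80% of 90 easy cases
    ("A", 2): (3, 10),   # 30% of 10 hard cases
    ("B", 1): (9, 10),   # 90% of 10 easy cases
    ("B", 2): (36, 90),  # 40% of 90 hard cases
}


def cure_rate(treatment, group):
    """Cure rate of one treatment within one sub-group."""
    cured, treated = counts[(treatment, group)]
    return cured / treated


def pooled_rate(treatment):
    """Cure rate of one treatment with both sub-groups lumped together."""
    cured = sum(c for (t, _), (c, _n) in counts.items() if t == treatment)
    treated = sum(n for (t, _), (_c, n) in counts.items() if t == treatment)
    return cured / treated


for group in (1, 2):
    print(f"sub-group {group}: A={cure_rate('A', group):.0%}  "
          f"B={cure_rate('B', group):.0%}")

print(f"pooled:      A={pooled_rate('A'):.0%}  B={pooled_rate('B'):.0%}")
```

With these counts B is ahead in each sub-group (90% vs 80%, 40% vs 30%), yet the pooled rates are 75% for A and 45% for B, because 90 of A's 100 patients are easy cases while 90 of B's 100 are hard ones.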
The resolution depends on what the sub-group variable represents. If sub-group membership is a confounder you should adjust for (for example, severity of disease), the within-group comparison is right. If sub-group membership is itself caused by treatment (a mediator), the pooled comparison may be right. Without causal context, the data alone cannot tell you which.
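For the confounder case, one common way to adjust is direct standardization: weight each treatment's sub-group cure rates by the same patient mix, here the mix of the whole study population. The sketch below is only an illustration under that assumption. The counts are hypothetical (80%/30% for A, 90%/40% for B, with A given mostly to sub-group 1), and share and standardized_rate are illustrative names. It does not help decide whether the variable is a confounder or a mediator; that still requires causal context.

```python
# Hypothetical counts: (treatment, sub_group) -> (patients cured, patients treated)
counts = {
    ("A", 1): (72, 90),
    ("A", 2): (3, 10),
    ("B", 1): (9, 10),
    ("B", 2): (36, 90),
}
treatments = ("A", "B")
groups = (1, 2)

# Share of each sub-group in the whole study population (both arms combined).
total = sum(treated for _cured, treated in counts.values())
share = {
    g: sum(counts[(t, g)][1] for t in treatments) / total
    for g in groups
}


def standardized_rate(treatment):
    """Cure rate the treatment would show if applied to the overall patient mix.

    Assumes sub-group membership is a confounder, not a consequence of treatment.
    """
    return sum(
        share[g] * counts[(treatment, g)][0] / counts[(treatment, g)][1]
        for g in groups
    )


for t in treatments:
    print(f"treatment {t}: standardized cure rate = {standardized_rate(t):.0%}")
```

Here each sub-group makes up half of the 200 patients, so the standardized rates are 55% for A and 65% for B. Once both treatments are evaluated on the same mix, B comes out ahead, in agreement with the within-group comparison.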