Definition
Hypothesis testing is a structured procedure for using sample data to decide between two competing claims about a population. The default claim — the null hypothesis, written H0 — typically asserts no effect, no difference, or no relationship. The rival claim — the alternative hypothesis, H1 — asserts that some effect exists. The test asks whether the observed data would be surprising if the null were true; if the data is sufficiently unlikely under the null, the analyst rejects it in favour of the alternative.
The procedure is a core tool of frequentist statistical inference and a standard reporting format across the empirical sciences. Its central output is a p-value: the probability of seeing data at least as extreme as what was observed, assuming the null is correct. The p-value is not the probability that the null itself is true.
Why it matters
How it works
A standard test proceeds in five steps. First, state the null and alternative hypotheses in measurable terms: for example, that the population mean equals a specific value, or that two group means are equal. Second, choose a significance level (commonly 0.05), which caps the false-positive risk you are willing to tolerate. Third, compute a test statistic from the sample, such as a t-statistic, a z-score, or a chi-squared value, that measures how far the observed data sits from what the null predicts. Fourth, convert that statistic into a p-value using the statistic's theoretical distribution under the null. Fifth, compare the p-value to the significance level: if it is at or below the level, reject the null; otherwise, fail to reject it.
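The five steps can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it assumes a two-sided one-sample z-test with a known population standard deviation, and it treats a p-value exactly equal to the significance level as a rejection. The function name `one_sample_z_test`, the sample values, and the parameters `mu0` and `sigma` are all invented for the example.

```python
from math import sqrt
from statistics import NormalDist, mean


def one_sample_z_test(sample, mu0, sigma, alpha=0.05):
    """Two-sided one-sample z-test, assuming the population std dev is known."""
    n = len(sample)
    # Step 3: test statistic = distance of the sample mean from mu0,
    # measured in standard errors.
    z = (mean(sample) - mu0) / (sigma / sqrt(n))
    # Step 4: two-sided p-value from the standard normal distribution.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # Step 5: compare the p-value to the significance level.
    reject = p_value <= alpha
    return z, p_value, reject


# Step 1: H0 says the population mean is 100; H1 says it is not.
# Step 2: significance level of 0.05.
# The measurements below are made-up illustrative data.
sample = [102.1, 98.4, 105.3, 101.7, 99.9, 104.2, 103.5, 100.8]
z, p_value, reject = one_sample_z_test(sample, mu0=100.0, sigma=3.0, alpha=0.05)

print(f"z = {z:.3f}, p = {p_value:.3f}")
print("reject H0" if reject else "fail to reject H0")
```

When the population standard deviation is unknown, which is the usual case in practice, a t-test would typically replace the z-test; the five-step structure stays the same.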
Two error types frame the trade-off. A Type I error rejects a null that is actually true (false positive); a Type II error fails to reject a null that is actually false (false negative). For a fixed sample size, tightening the significance level reduces Type I errors but raises Type II errors, and vice versa. Sample size is the lever that eases this trade-off: at a given significance level, more data lowers the Type II error rate, so the test can detect smaller true effects without inflating false positives.
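The trade-off can be made concrete with a small calculation. The sketch below assumes the same two-sided z-test setting with a known standard deviation and computes the Type II error rate analytically for a hypothetical true mean shift. The function name `type_ii_error_rate` and the values chosen for `effect`, `sigma`, `n`, and `alpha` are illustrative assumptions, not figures from any study.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error_rate(effect, sigma, n, alpha):
    """Probability that a two-sided z-test misses a true mean shift of `effect`."""
    std_normal = NormalDist()
    # Critical value: reject H0 when |z| exceeds this threshold.
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Under the alternative, the z-statistic is centred here instead of at 0.
    shift = effect * sqrt(n) / sigma
    # Power = probability that the statistic lands in either rejection tail.
    power = std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)
    return 1 - power


# Hypothetical scenario: a true shift of 0.5 standard deviations.
for alpha in (0.05, 0.01):
    for n in (20, 80):
        beta = type_ii_error_rate(effect=0.5, sigma=1.0, n=n, alpha=alpha)
        print(f"alpha={alpha:<5} n={n:<3} Type II error rate={beta:.3f}")
```

Reading down the output, lowering `alpha` at a fixed `n` raises the Type II error rate, while raising `n` at a fixed `alpha` lowers it. The Type I error rate stays pinned at `alpha` throughout, because the significance level is what sets it.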