Definition
The correlation coefficient — most often Pearson's r — is a single number that captures how tightly two quantitative variables move together along a straight line. Its value runs from negative one through zero to positive one. A coefficient of positive one means the two variables rise together along a perfect straight line; negative one means one rises exactly as the other falls; zero means there is no linear relationship at all. Intermediate values describe partial alignment, with magnitudes near one indicating tight linear relationships and magnitudes near zero indicating loose ones.
The coefficient is dimensionless: it does not depend on the units in which the variables are measured. Converting height from inches to centimetres or income from dollars to euros leaves r unchanged, because rescaling a variable by a positive constant rescales its covariance and its standard deviation by the same factor. This makes it a useful common currency for comparing the strength of relationships across different studies and different scales.
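The unit-invariance claim can be checked numerically. The sketch below is illustrative only: the helper name pearson_r and the height and weight figures are made up for the example and are not taken from any real data set.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r: covariance divided by the product of standard deviations.

    The 1/n factors cancel between numerator and denominator, so plain sums
    of deviations are used. Assumes both inputs have non-zero variance.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical measurements, invented purely for illustration.
heights_in = [60.0, 62.0, 65.0, 68.0, 70.0, 73.0]
weights_kg = [52.0, 58.0, 61.0, 70.0, 72.0, 80.0]

# Same heights expressed in centimetres (1 inch = 2.54 cm).
heights_cm = [h * 2.54 for h in heights_in]

r_inches = pearson_r(heights_in, weights_kg)
r_cm = pearson_r(heights_cm, weights_kg)

# The two values agree up to floating-point rounding.
print(r_inches, r_cm)
```

The conversion factor appears once in the covariance and once in the standard deviation of height, so it cancels in the ratio.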
Why it matters
How it works
Pearson's correlation coefficient is computed by standardising each variable to zero mean and unit variance, multiplying the standardised pairs, and averaging the products across the data set. This is equivalent to dividing the covariance between the two variables by the product of their standard deviations. The standardisation strips out scale, which is why the result does not depend on the original units. The bound between negative one and positive one follows from the Cauchy-Schwarz inequality: the covariance can never exceed the product of the standard deviations in magnitude.
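The two descriptions of the calculation can be compared side by side. The following is a minimal sketch, assuming population standard deviations (dividing by n) in both routes. The function names and the small data set are hypothetical choices for illustration.

```python
import math


def pearson_by_standardising(xs, ys):
    """Standardise each variable, multiply the pairs, average the products."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Population standard deviations (divide by n), an assumption of this sketch.
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    z_x = [(x - mean_x) / sd_x for x in xs]
    z_y = [(y - mean_y) / sd_y for y in ys]
    return sum(a * b for a, b in zip(z_x, z_y)) / n


def pearson_by_covariance(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (sd_x * sd_y)


# Small made-up data set.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

r_standardised = pearson_by_standardising(xs, ys)
r_covariance = pearson_by_covariance(xs, ys)
print(r_standardised, r_covariance)  # both routes give the same value

# A perfect straight line with positive slope gives r = 1 (up to rounding).
line_ys = [2.0 * x + 1.0 for x in xs]
r_line = pearson_by_covariance(xs, line_ys)
print(r_line)
```

Using the sample standard deviation (dividing by n - 1) throughout would give the same r, provided the covariance uses the same divisor.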
Interpretation requires care. The coefficient summarises linear association only; two variables linked by a strong but curved relationship, such as an inverted-U shape, can have a correlation close to zero. A few extreme values can either inflate or suppress the coefficient relative to the bulk of the data. And the magnitude of correlation that counts as meaningful depends on context: r equal to zero point three can be noteworthy in psychology, where many influences are at play, but would be considered weak in physics, where confounders are usually controlled. Always pair the coefficient with a scatter plot and an honest discussion of sample size and outliers.
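Two of these pitfalls are easy to reproduce. The sketch below uses invented numbers and a hypothetical helper pearson_r. It shows a perfectly deterministic inverted-U relationship with zero correlation, and a loosely related data set whose coefficient jumps once a single extreme point is added.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r; assumes both inputs have non-zero variance."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Pitfall 1: a strong but curved relationship.
# y is fully determined by x, yet the symmetric inverted U has no linear trend.
curve_xs = [-3, -2, -1, 0, 1, 2, 3]
curve_ys = [-(x ** 2) for x in curve_xs]
r_curve = pearson_r(curve_xs, curve_ys)
print(r_curve)  # zero, by symmetry

# Pitfall 2: one extreme value dominating the coefficient.
bulk_xs = [1, 2, 3, 4, 5]
bulk_ys = [3, 1, 4, 2, 3]
r_bulk = pearson_r(bulk_xs, bulk_ys)

# Append a single made-up outlier far from the rest of the data.
r_with_outlier = pearson_r(bulk_xs + [20], bulk_ys + [20])
print(r_bulk, r_with_outlier)  # weak without the outlier, strong with it
```

In both cases a scatter plot would reveal at a glance what the single number hides.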