Definition
Principal Component Analysis, or PCA, is a linear technique that finds new axes for a dataset such that the first axis captures the most variance in the data, the second captures the most of what remains while staying orthogonal to the first, and so on. Each new axis is called a principal component, and each is a weighted linear combination of the original variables. By keeping only the leading components, an analyst can describe most of the dataset's shape with far fewer numbers than the original variable count.
Mathematically, PCA is the eigendecomposition of the data's covariance matrix, or equivalently the singular value decomposition of the mean-centered data matrix. The eigenvalues rank the components by the variance they explain; the eigenvectors give the rotation that maps original coordinates onto principal axes. Despite the linear-algebra machinery, PCA is best understood geometrically: it finds the natural orientation of the data cloud and re-expresses every point in those coordinates.
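The equivalence between the two decompositions can be written out in a few lines. This is a sketch under stated assumptions: X denotes the mean-centered data matrix with n observations in rows and p variables in columns, the covariance uses the sample normalization 1/(n-1), and the SVD is the thin form. These symbols are introduced here for illustration.

```latex
% Sample covariance of the mean-centered data matrix X (n x p)
C = \frac{1}{n-1} X^{\top} X

% Thin singular value decomposition of X
X = U \Sigma V^{\top}, \qquad U^{\top} U = I, \quad V^{\top} V = I

% Substituting the SVD into the covariance
C = \frac{1}{n-1} V \Sigma U^{\top} U \Sigma V^{\top}
  = V \left( \frac{\Sigma^{2}}{n-1} \right) V^{\top}

% Hence the columns of V are eigenvectors of C, with eigenvalues
\lambda_i = \frac{\sigma_i^{2}}{n-1}
```

Read this way, the right singular vectors are the principal axes, and each squared singular value, divided by n-1, is the variance explained along its axis.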
Why it matters
How it works
The recipe is short. Center each feature on its mean, optionally divide by its standard deviation to put features on equal footing (which amounts to working with the correlation matrix instead of the covariance matrix), then compute the covariance matrix and take its eigendecomposition. The eigenvectors form an orthogonal basis; sort them by eigenvalue in descending order and you have a ranking of directions by explained variance. Project the original data onto the top k eigenvectors and you have a k-dimensional embedding that preserves as much of the original variance as any k-dimensional orthogonal projection can.
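The recipe above can be sketched in NumPy. This is a minimal illustration, not a reference implementation: the function name pca, its parameters, and the synthetic input data are assumptions made for the example.

```python
import numpy as np

def pca(X, k, standardize=False):
    """Illustrative helper: return (scores, components, explained_variance)."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                  # center each feature on its mean
    if standardize:
        Xc = Xc / Xc.std(axis=0, ddof=1)     # optional: unit variance per feature
    cov = np.cov(Xc, rowvar=False)           # feature-by-feature covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh suits symmetric matrices; ascending order
    order = np.argsort(eigvals)[::-1]        # re-sort by eigenvalue, descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    components = eigvecs[:, :k]              # columns are the top k principal axes
    scores = Xc @ components                 # k-dimensional embedding of the data
    return scores, components, eigvals[:k]

# Synthetic data for illustration only: 200 points, 3 correlated features
rng = np.random.default_rng(0)
mixing = np.array([[2.0, 0.5, 0.0],
                   [0.0, 1.0, 0.3],
                   [0.0, 0.0, 0.2]])
X = rng.normal(size=(200, 3)) @ mixing
scores, components, explained = pca(X, k=2)
```

Here scores holds the projected data, components holds the principal axes as columns, and explained holds the variance along each retained axis. For large or ill-conditioned problems, the SVD of the centered matrix is often preferred over forming the covariance matrix explicitly.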
The interpretive payoff comes from inspecting the loadings: the coefficients that say how much each original variable contributes to each component. In equity returns the first component often looks like a market factor: every stock loads positively. In yield curves the first component shifts every tenor in parallel and is called level. The second adds long-tenor weight and subtracts short-tenor weight, producing the slope component. PCA does not name these axes (that is the analyst's job), but it reliably surfaces them when the structure is there.
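The market-factor pattern can be reproduced on synthetic data. The sketch below assumes a hypothetical one-factor model in which every stock's return is a positive multiple of a common market return plus independent noise; the betas, volatilities, and sample size are invented for illustration and are not real market data.

```python
import numpy as np

rng = np.random.default_rng(42)
n_days, n_stocks = 500, 5

# Hypothetical one-factor model: return = beta * market + idiosyncratic noise
market = rng.normal(scale=0.01, size=n_days)
betas = np.array([0.8, 0.9, 1.0, 1.1, 1.2])   # assumed values, all positive
noise = rng.normal(scale=0.005, size=(n_days, n_stocks))
returns = market[:, None] * betas + noise

# PCA via thin SVD of the mean-centered data; rows of Vt are principal axes
centered = returns - returns.mean(axis=0)
_, s, Vt = np.linalg.svd(centered, full_matrices=False)

# Loadings of the first component: one coefficient per stock
loadings = Vt[0]
# The sign of a principal axis is arbitrary, so fix it for readability
if loadings.sum() < 0:
    loadings = -loadings

# Share of total variance explained by each component
explained_ratio = s**2 / np.sum(s**2)
```

With a common factor this strong, the first-component loadings all share one sign and the first entry of explained_ratio dominates, which is the signature an analyst would label a market factor. The sign flip is a presentation choice only; PCA itself does not determine the direction of an axis.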