Concept

Histogram

Definition

A histogram is a chart of adjacent bars that displays the frequency distribution of a continuous or grouped numerical variable. The horizontal axis is divided into class intervals (bins), usually of equal width, and the height of each bar represents the count, or proportion, of observations that fall within that bin. If the bins have unequal widths, the area of each bar, rather than its height, should represent the frequency. Unlike a regular bar chart, the bars in a histogram touch one another, because the variable on the x-axis is continuous rather than categorical.

The shape of a histogram is often one of the first things a statistician looks at when meeting a new dataset. It reveals where the data is centred, how spread out it is, whether it is symmetric or skewed, and whether there are multiple peaks or outliers.

Why it matters

How it works

To build a histogram, you first decide on a number of equal-width intervals that span the range of your data. A common rule of thumb is to use between five and twenty bins depending on sample size; many tools default to or offer Sturges' rule, which sets the number of bins from the sample size, or the Freedman-Diaconis rule, which sets the bin width from the interquartile range and the sample size. You then count how many observations fall into each interval and draw a bar of that height above the corresponding range on the x-axis. Bars are drawn touching because the underlying variable is continuous: adjacent bins share a boundary, so there is no real gap between them.
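The binning steps above can be sketched in a few lines of plain Python. This is a minimal illustration rather than a reference implementation: the function names and the sample data are hypothetical, each bin is treated as closed on the left and open on the right (with the maximum value placed in the last bin), and the quartiles come from the default method of `statistics.quantiles`, which is only one of several quartile conventions.

```python
import math
import statistics


def sturges_bin_count(n):
    """Sturges' rule: ceil(log2(n)) + 1 bins for n observations."""
    return math.ceil(math.log2(n)) + 1


def freedman_diaconis_width(data):
    """Freedman-Diaconis rule: bin width = 2 * IQR / n^(1/3)."""
    q1, _, q3 = statistics.quantiles(data, n=4)
    return 2 * (q3 - q1) / len(data) ** (1 / 3)


def histogram_counts(data, bins):
    """Count observations in `bins` equal-width intervals spanning the data range."""
    lo, hi = min(data), max(data)
    if hi == lo:
        raise ValueError("data must contain at least two distinct values")
    width = (hi - lo) / bins
    edges = [lo + i * width for i in range(bins + 1)]
    counts = [0] * bins
    for x in data:
        # Bins are [left, right); clamp so the maximum lands in the last bin.
        idx = min(int((x - lo) / width), bins - 1)
        counts[idx] += 1
    return edges, counts


# Hypothetical sample, chosen only to illustrate the steps.
data = [2, 3, 3, 4, 4, 4, 5, 5, 6, 9]
k = sturges_bin_count(len(data))
fd_width = freedman_diaconis_width(data)
edges, counts = histogram_counts(data, k)

# Text rendering: one row per bin, one '#' per observation.
for left, right, c in zip(edges, edges[1:], counts):
    print(f"[{left:4.1f}, {right:4.1f}) {'#' * c}")
```

With ten observations Sturges' rule gives five bins, and the empty fourth bin followed by a single observation in the fifth is the kind of isolated bar that is worth a second look.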

What you read from the shape is more important than the exact bar heights. A single tall peak in the middle with symmetric tails suggests a roughly normal distribution. A long tail to the right (positive skew) is common in income, response times, and waiting periods. Two clear peaks (bimodality) often suggest that the dataset is a mixture of two populations that may need to be analysed separately. Isolated bars far from the main mass flag possible outliers worth investigating before any statistical test is run.
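Two of these visual cues can be roughly quantified. The sketch below is illustrative only and is no substitute for looking at the plot: the function names and sample values are hypothetical, skewness is computed as the population third standardized moment, and the peak counter assumes there are no flat plateaus among the bin counts.

```python
import statistics


def sample_skewness(data):
    """Third standardized moment; a positive value suggests a longer right tail."""
    mean = statistics.fmean(data)
    sd = statistics.pstdev(data)
    return sum(((x - mean) / sd) ** 3 for x in data) / len(data)


def count_peaks(counts):
    """Count bins strictly taller than both neighbours.

    Bins beyond either end are treated as empty. Plateaus (equal adjacent
    counts) are not handled, so this is only a rough indicator.
    """
    padded = [0] + list(counts) + [0]
    return sum(
        1
        for i in range(1, len(padded) - 1)
        if padded[i - 1] < padded[i] > padded[i + 1]
    )


# Hypothetical values, chosen only to illustrate each shape.
right_skewed = [1, 1, 2, 2, 2, 3, 3, 4, 6, 12]
symmetric = [1, 2, 3, 4, 5]
bimodal_counts = [1, 4, 9, 4, 1, 0, 3, 7, 2]

print(sample_skewness(right_skewed) > 0)  # long right tail gives positive skew
print(count_peaks(bimodal_counts))        # number of strict local maxima
```

Peak counts depend heavily on the number of bins, so a second peak that appears only at one bin width is weak evidence of a mixture.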

Where it goes next
