Choosing a suitable graph


Core idea

The chart-selection question collapses into a two-axis grid. One axis is the shape of the data: discrete (separate categories or whole-number counts), continuous (positions on a number scale), or paired (two values that travel together). The other axis is the kind of judgement: describe a single batch, compare two or more batches, or relate two variables. The intersection of your row and column picks the chart.

Why it matters

Most chart mistakes are not technical errors; they are mismatches between data type and chart type. A pie chart of running times is wrong because the times don't sum to a meaningful whole; a bar chart with touching bars is wrong because the categories don't sit on a number line; a scatter graph with the dots joined up is wrong because the points have no natural order. Pin down the data type before you pick the chart and you avoid these failures.

Mental model

Data type taxonomy

Before any chart, classify your data along two cuts: numerical or not, and (if numerical) discrete or continuous.

[Figure: Data type taxonomy]
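The two cuts can be sketched as a small Python helper. This is a rough heuristic, not a definitive classifier: the function name `classify_values` and its labels are illustrative assumptions, and it only inspects how values are stored. A delay recorded in whole minutes is still a continuous measurement, so treat the result as a first guess and overrule it when you know how the data was collected.

```python
from numbers import Integral, Real


def classify_values(values):
    """Heuristically label a batch as 'categorical', 'discrete' or 'continuous'.

    Hypothetical helper: it looks only at the stored type of each value.
    """
    values = list(values)
    if not values:
        raise ValueError("cannot classify an empty batch")

    def is_number(v):
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(v, Real) and not isinstance(v, bool)

    # First cut: numerical or not.
    if not all(is_number(v) for v in values):
        return "categorical"  # attribute data: names, labels, yes/no flags

    # Second cut: whole-number counts versus positions on a number scale.
    if all(isinstance(v, Integral) for v in values):
        return "discrete"
    return "continuous"


print(classify_values(["weather", "crew", "weather"]))
print(classify_values([0, 2, 1, 3]))
print(classify_values([12.5, 3.0, 47.25]))
```

The three calls cover one batch of labels, one batch of counts and one batch of measurements.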

The judgement × data-type grid

The chart picks itself once you've named both inputs. This grid summarises the choice.

[Figure: The judgement × data-type grid]

Practical application

To stop second-guessing chart choices, run this three-step decision every time.

  1. Classify the data. Is it categorical (attribute) or numerical (variable)? If numerical, discrete or continuous? Is each row a single measurement or a pair?

  2. Name the judgement. Are you describing a single batch, comparing two or more batches, or exploring a relationship between two variables?

  3. Read the grid. Discrete + describe → bar or pie. Continuous + describe → histogram or stemplot. Discrete + compare → compound bar. Continuous + compare → paired histograms or back-to-back stemplot. Paired + relate → scatter (non-time) or time graph (time-ordered).
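The three steps above amount to a table lookup. A minimal sketch in Python, with the cells transcribed from step 3; the names `CHART_GRID` and `suggest_charts` and the `time_ordered` flag are illustrative assumptions rather than part of any library.

```python
# Grid transcribed from step 3; keys are (data type, judgement).
CHART_GRID = {
    ("discrete", "describe"): ["bar chart", "pie chart"],
    ("continuous", "describe"): ["histogram", "stemplot"],
    ("discrete", "compare"): ["compound bar chart"],
    ("continuous", "compare"): ["paired histograms", "back-to-back stemplot"],
    ("paired", "relate"): ["scatter graph", "time graph"],
}


def suggest_charts(data_type, judgement, time_ordered=False):
    """Return the chart types listed in the grid cell for this combination."""
    key = (data_type, judgement)
    if key not in CHART_GRID:
        # The grid in step 3 names only five cells; anything else is undefined.
        raise KeyError(f"no grid cell for {data_type!r} + {judgement!r}")
    if key == ("paired", "relate"):
        # Paired data splits on whether one of the variables is time.
        return ["time graph"] if time_ordered else ["scatter graph"]
    return list(CHART_GRID[key])


print(suggest_charts("continuous", "describe"))
print(suggest_charts("paired", "relate", time_ordered=True))
```

Raising an error for an unlisted combination is a deliberate choice here: a missing cell usually means the data type or the judgement was named wrongly, which is worth noticing before drawing anything.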

Example

You are reviewing a customer-experience dashboard for an airline. Six potential charts come to mind. Run each through the diagnostic.

  • Flight delay reason (mechanical, weather, crew, ATC, other) across all flights this quarter. Discrete attribute, describing. → bar chart sorted by frequency, or a pie chart (the whole pie is meaningful — it's 100% of delayed flights).
  • Distribution of delay length in minutes for all delayed flights. Continuous variable, describing. → histogram with 10-minute class intervals and touching bars; the area of each bar shows frequency.
  • Delay length distribution for short-haul vs long-haul, side by side. Continuous variable, comparing two batches. → paired histograms or a back-to-back stemplot.
  • Per-customer baggage weight vs ticket class. One continuous variable split by one categorical attribute, comparing batches. → grouped boxplots (introduced in the next topic).
  • Customer satisfaction score vs flight delay length, per customer. Two continuous variables, relating. → scatter graph; do not join the dots.
  • Monthly on-time performance over the past two years. Time-ordered paired data. → time graph; dots joined.
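The histogram in the second item depends on sorting delay lengths into 10-minute class intervals. A minimal sketch of that binning step in Python follows; the delay values are invented purely for illustration, and `class_frequencies` is a hypothetical helper name.

```python
from collections import Counter

CLASS_WIDTH = 10  # minutes, matching the 10-minute class intervals above

# Invented delay lengths in minutes, for illustration only.
delays = [3.5, 12.0, 17.2, 24.9, 8.1, 31.0, 15.5, 19.9, 42.3, 10.0]


def class_frequencies(values, width):
    """Count values per class interval [lower, lower + width)."""
    # Floor-divide to find each value's interval, then scale back to its lower bound.
    counts = Counter(int(v // width) * width for v in values)
    return {(lower, lower + width): counts[lower] for lower in sorted(counts)}


frequencies = class_frequencies(delays, CLASS_WIDTH)
for (lower, upper), count in frequencies.items():
    # One '#' per flight gives a rough text histogram.
    print(f"{lower:>3}-{upper:<3} {'#' * count}")
```

Each interval includes its lower bound and excludes its upper bound, so a delay of exactly 10 minutes falls in the 10-20 class. Because every class has the same width here, bar height and bar area carry the same information; with unequal widths the two would differ.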

Six analytical questions about the same operation produce six structurally different charts. Picking the chart is not an aesthetic decision — it is a commitment to a particular question and a particular kind of data.
