Historical Data Visualisation

To choose the right chart for historical data, start from your question, not your software's chart menu. Trends over time want a line; comparisons across categories want a bar; part-to-whole wants a stacked bar; relationships between two measures want a scatter. Most "bad chart" problems trace back to forcing a familiar shape onto a question it cannot answer. This troubleshooting guide works backwards from the symptoms.

Symptom: the chart implies data you do not have

This is the most common and most damaging error. A line drawn through decadal census points implies you know every year between — you do not.

Root cause: a continuous chart type on discrete data.

Fix: for stock-on-a-date values such as census counts, use a step chart or plot markers with no connecting line, and break the line at missing years.

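python
# df: pandas DataFrame indexed by year, with census populations
# (1801, 1811, 1821 ...) in column "pop"; plotting needs matplotlib installed
df["pop"].plot(drawstyle="steps-post", marker="o")
# or, to expose gaps, reindex to every year and plot again:
# range() excludes its end, so 1852 keeps the 1851 census
df = df.reindex(range(1801, 1852))   # missing years -> NaN -> visible break
df["pop"].plot(marker="o")           # isolated points show as markers only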

Symptom: categories that should be comparable are hard to read

If readers must squint to compare slices, you have probably reached for a pie or donut.

Root cause: angle is far harder to judge than length.

Fix: switch to a horizontal bar chart sorted by value. A pie is defensible only for two or three parts of a single whole — never for change over time.
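The switch to a sorted horizontal bar chart can be sketched as follows. This is a minimal illustration rather than a fixed recipe: the helper names `sorted_for_barh` and `plot_sorted_bars`, the category labels, and the numbers are all placeholders invented for the example.

```python
import pandas as pd


def sorted_for_barh(values: pd.Series) -> pd.Series:
    """Sort ascending: barh draws the first row at the bottom,
    so the largest category ends up at the top of the chart."""
    return values.sort_values(ascending=True)


def plot_sorted_bars(values: pd.Series):
    """Draw the sorted horizontal bars (needs matplotlib installed,
    which pandas uses as its default plotting backend)."""
    ax = sorted_for_barh(values).plot.barh()
    ax.set_xlabel(values.name or "value")
    return ax


# Placeholder data, not real historical figures
occupations = pd.Series(
    {"Category A": 120, "Category B": 340, "Category C": 75, "Category D": 210},
    name="count",
)
ordered = sorted_for_barh(occupations)
# plot_sorted_bars(occupations)  # uncomment to draw the chart
```

Sorting before plotting lets readers compare bar lengths from longest to shortest, which is the comparison a pie forces them to make by angle.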

How do I match question to chart?

Use this decision table as a first pass:

| Your question | Chart | Common mistake to avoid |
|---|---|---|
| How did one measure change over time? | Line | Don't use bars for long continuous series |
| How do categories compare at one time? | Bar (sorted) | Don't use a pie for >4 slices |
| How did a composition shift over time? | Stacked area / 100% bar | Watch baseline distortion in stacked areas |
| How are two measures related? | Scatter | Don't draw a trend line through tiny n |
| How is one value distributed? | Histogram / boxplot | Don't pick bins that hide bimodality |
| How does a flow move place to place? | Sankey / flow map | Don't use for vague or sparse links |

Symptom: two series look suspiciously correlated

A dual-axis chart with population on the left and prices on the right can make any two lines hug each other.

Root cause: independent scales you can slide until the curves align — visual coincidence, not evidence.

Fix: index both series to a common base year (value / base * 100) and plot on one axis, or use small multiples so each keeps its own honest scale.
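A minimal sketch of the indexing step, assuming both series share a year index and that the base year is present in each. The function names `index_to_base` and `plot_indexed`, the choice of 1801 as base year, and the values are placeholders for illustration only.

```python
import pandas as pd


def index_to_base(series: pd.Series, base_year: int) -> pd.Series:
    """Rescale a series so the base year equals 100 (value / base * 100)."""
    base = series.loc[base_year]  # raises KeyError if the base year is absent
    if pd.isna(base) or base == 0:
        raise ValueError(f"no usable value for base year {base_year}")
    return series / base * 100


def plot_indexed(indexed: pd.DataFrame, base_year: int):
    """Plot every indexed series on one shared axis (needs matplotlib)."""
    ax = indexed.plot(marker="o")
    ax.set_ylabel(f"Index ({base_year} = 100)")
    return ax


# Placeholder values, not real historical figures
years = [1801, 1811, 1821]
population = pd.Series([200.0, 250.0, 300.0], index=years, name="population")
prices = pd.Series([4.0, 5.0, 5.0], index=years, name="prices")

indexed = pd.DataFrame({
    "population": index_to_base(population, 1801),
    "prices": index_to_base(prices, 1801),
})
# plot_indexed(indexed, 1801)  # uncomment to draw the chart
```

Once both series start at 100, the single axis shows relative change since the base year, and there is no second scale left to slide.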

Symptom: the trend is buried in noise

Annual counts jitter and the eye loses the signal.

Root cause: raw annual counts plotted alone, so year-to-year variation swamps the underlying trend.

Fix: keep the raw series faint and overlay a centred rolling mean; or, for comparing many regions, replace one crowded chart with small multiples, one panel each. Small multiples beat a rainbow of overlapping lines almost every time.
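The faint-raw-plus-rolling-mean overlay might look like the sketch below. The window length, the helper names `centred_rolling_mean` and `plot_raw_and_trend`, and the counts are assumptions chosen for the example; a suitable window depends on the series.

```python
import pandas as pd


def centred_rolling_mean(series: pd.Series, window: int = 5) -> pd.Series:
    """Centred moving average. Years without a full window on both
    sides stay NaN instead of being padded with invented values."""
    return series.rolling(window=window, center=True).mean()


def plot_raw_and_trend(raw: pd.Series, trend: pd.Series):
    """Faint raw series with the smoothed trend on top (needs matplotlib)."""
    ax = raw.plot(color="grey", alpha=0.4, linewidth=1, label="annual count")
    trend.plot(ax=ax, color="black", linewidth=2, label="rolling mean")
    ax.legend()
    return ax


# Placeholder annual counts, not real historical figures
counts = pd.Series(
    [10, 14, 9, 15, 12, 18, 11], index=range(1840, 1847), name="count"
)
smooth = centred_rolling_mean(counts, window=3)
# plot_raw_and_trend(counts, smooth)  # uncomment to draw the chart
```

A centred window keeps the smoothed line aligned with the years it summarises, and the NaN values at both ends leave the trend visibly shorter than the raw series rather than implying data that is not there.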

When does a specialised chart earn its place?

Flow maps, Sankey diagrams, and network graphs are powerful but easy to misapply. Reach for them only when the relationship is the point — migration between regions, money between accounts, correspondence between people. If the data is sparse or the links are vague, a clean table often communicates more honestly than an impressive-looking diagram that readers cannot trace.

Key Takeaways

  • Pick from the question, not the chart-type toolbar.
  • Never draw a continuous line through discrete or missing data — use steps, markers, or breaks.
  • Replace pies of more than a few slices with sorted bar charts.
  • Avoid dual axes; index to a base year or use small multiples instead.
  • Surface buried trends with a faint raw series plus a rolling mean.
  • Reserve Sankey, flow maps, and networks for cases where the relationship is the message.

Frequently Asked Questions

How do I choose the right chart type for historical data?

Start from the question, not the chart: trends over time call for lines, comparisons across categories for bars, part-to-whole for stacked bars, and relationships between two variables for scatter plots. Then check the shape of the data (continuous or discrete, complete or gappy) before committing to the form.

Why does my historical pie chart look wrong?

Pie charts hide change and become unreadable past four or five slices; if you are comparing categories or showing how a share shifts over time, a bar chart or a stacked-area chart almost always communicates better.

When should I use a bar chart instead of a line chart?

Use bars for distinct, comparable categories or for counts at discrete moments such as census years; use lines only when the x-axis is continuous and interpolation between points is meaningful.

My chart implies data I do not have — how do I fix it?

Switch a line to a step or marker-only chart so it stops drawing through gaps, break the line at missing values, and add a caption stating which years are interpolated or absent.

Is a dual-axis chart ever acceptable in history?

Rarely; two y-axes invite spurious correlation because you can slide the scales to make any two series appear linked, so prefer indexed lines or small multiples instead.