To choose the right chart for historical data, start from your question, not your software's chart menu. Trends over time want a line; comparisons across categories want a bar; part-to-whole wants a stacked bar; relationships between two measures want a scatter. Most "bad chart" problems trace back to forcing a familiar shape onto a question it cannot answer. This troubleshooting guide works backwards from the symptoms.
Symptom: the chart implies data you do not have
This is the most common and most damaging error. A line drawn through decennial census points implies you know the value for every year in between; you do not.
Root cause: a continuous chart type on discrete data.
Fix: for values that record a stock on a single date, use a step chart or markers only, and break the line at missing years.
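```python
# Census populations at 1801, 1811, 1821 ... (df is assumed to be indexed by census year)
df["pop"].plot(drawstyle="steps-post", marker="o")
# or, to expose gaps:
df = df.reindex(range(1801, 1851))  # missing years -> NaN -> visible break
df["pop"].plot(marker="o")  # NaN years are not joined, so only observed points appear
```
Symptom: categories that should be comparable are hard to read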
If readers must squint to compare slices, you have probably reached for a pie or donut.
Root cause: angle is far harder to judge than length.
Fix: switch to a horizontal bar chart sorted by value. A pie is defensible only for two or three parts of a single whole — never for change over time.
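As a minimal sketch of the sorted-bar fix, assuming the categories sit in a pandas Series: the occupation names and counts below are hypothetical illustration values, and `prepare_for_barh` is a helper name invented for this example. A horizontal bar chart draws the first row at the bottom, so an ascending sort puts the largest category at the top.

```python
import pandas as pd

# Hypothetical category counts, for illustration only (not real data).
occupations = pd.Series(
    {
        "Weavers": 310,
        "Farm labourers": 540,
        "Miners": 120,
        "Clerks": 45,
        "Servants": 280,
    }
)


def prepare_for_barh(values: pd.Series) -> pd.Series:
    """Sort ascending so the largest bar ends up at the top of a barh chart."""
    return values.sort_values(ascending=True)


ordered = prepare_for_barh(occupations)
print(ordered)
```

With matplotlib installed, `ordered.plot.barh()` should then draw the bars in rank order, which lets readers compare lengths directly instead of judging angles.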
How do I match question to chart?
Use this decision table as a first pass:
| Your question | Chart | Common mistake to avoid |
|---|---|---|
| How did one measure change over time? | Line | Don't use bars for long continuous series |
| How do categories compare at one time? | Bar (sorted) | Don't use a pie for >4 slices |
| How did a composition shift over time? | Stacked area / 100% bar | Watch baseline distortion in stacked areas |
| How are two measures related? | Scatter | Don't draw a trend line through tiny n |
| How is one value distributed? | Histogram / boxplot | Don't pick bins that hide bimodality |
| How does a flow move place to place? | Sankey / flow map | Don't use for vague or sparse links |
Symptom: two series look suspiciously correlated
A dual-axis chart with population on the left and prices on the right can make any two lines hug each other.
Root cause: independent scales you can slide until the curves align — visual coincidence, not evidence.
Fix: index both series to a common base year (value / base * 100) and plot on one axis, or use small multiples so each keeps its own honest scale.
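The indexing step can be sketched as follows, assuming both series share a year index in one pandas DataFrame. The figures are hypothetical and `index_to_base` is a helper name made up for this example; the arithmetic is the value / base * 100 formula given above.

```python
import pandas as pd

# Hypothetical figures, for illustration only (not real historical data).
df = pd.DataFrame(
    {"population": [200, 220, 260, 300], "price": [8.0, 10.0, 9.0, 12.0]},
    index=[1801, 1811, 1821, 1831],
)


def index_to_base(frame: pd.DataFrame, base_year: int) -> pd.DataFrame:
    """Rescale every column so that its value in base_year equals 100."""
    return frame / frame.loc[base_year] * 100


indexed = index_to_base(df, base_year=1801)
print(indexed.round(1))
```

Both columns now start at 100 in the base year, so `indexed.plot(marker="o")` can put them on a single axis and any apparent co-movement reflects proportional change rather than a chosen scale.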
Symptom: the trend is buried in noise
Annual counts jitter and the eye loses the signal.
Root cause: raw annual counts plotted with no smoothing, so year-to-year variation dominates.
Fix: keep the raw series faint and overlay a centred rolling mean; or, for comparing many regions, replace one crowded chart with small multiples, one panel each. Small multiples beat a rainbow of overlapping lines almost every time.
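One way to compute the centred rolling mean, sketched with pandas under the assumption that the counts are a Series indexed by year. The counts are hypothetical, `centred_mean` is a helper name invented here, and the five-year window is an arbitrary choice to adjust for your data.

```python
import pandas as pd

# Hypothetical annual counts, for illustration only (not real data).
counts = pd.Series(
    [12, 18, 9, 21, 15, 24, 14, 27, 19, 30],
    index=range(1840, 1850),
    name="count",
)


def centred_mean(series: pd.Series, window: int = 5) -> pd.Series:
    """Centred rolling mean; the ends stay NaN rather than being guessed."""
    return series.rolling(window=window, center=True).mean()


smooth = centred_mean(counts, window=5)
print(smooth)
```

To draw the faint-plus-bold combination, something like `ax = counts.plot(alpha=0.3)` followed by `smooth.plot(ax=ax)` should work. The first and last two years stay empty in the smoothed line, which is more honest than extending the average over years it cannot cover.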
When does a specialised chart earn its place?
Flow maps, Sankey diagrams, and network graphs are powerful but easy to misapply. Reach for them only when the relationship is the point — migration between regions, money between accounts, correspondence between people. If the data is sparse or the links are vague, a clean table often communicates more honestly than an impressive-looking diagram that readers cannot trace.
Key Takeaways
- Pick from the question, not the chart-type toolbar.
- Never draw a continuous line through discrete or missing data — use steps, markers, or breaks.
- Replace pies of more than a few slices with sorted bar charts.
- Avoid dual axes; index to a base year or use small multiples instead.
- Surface buried trends with a faint raw series plus a rolling mean.
- Reserve Sankey, flow maps, and networks for cases where the relationship is the message.
Frequently Asked Questions
How do I choose the right chart type for historical data?
Start from the question, not the chart: trends over time call for lines, comparison across categories for bars, part-to-whole for stacked bars, and relationships between two variables for scatter plots; let the data structure pick the form.
Why does my historical pie chart look wrong?
Pie charts hide change and become unreadable past four or five slices; if you are comparing categories or showing how a share shifts over time, a bar chart or a stacked-area chart almost always communicates better.
When should I use a bar chart instead of a line chart?
Use bars for distinct, comparable categories or for counts at discrete moments such as census years; use lines only when the x-axis is continuous and interpolation between points is meaningful.
My chart implies data I do not have — how do I fix it?
Switch a line to a step or marker-only chart so it stops drawing through gaps, break the line at missing values, and add a caption stating which years are interpolated or absent.
Is a dual-axis chart ever acceptable for historical data?
Rarely; two y-axes invite spurious correlation because you can slide the scales to make any two series appear linked, so prefer indexed lines or small multiples instead.