Historical Data Visualisation

Visualise gaps in the record when you can tell a true absence apart from a survival or sampling artefact, and when that absence changes how a reader should interpret your data. If a gap is just "we have not digitised this yet" or "the sample stops here", a sentence in your methods note is more honest than a chart that dramatises nothing. The technique earns its place when missingness is itself evidence: a destroyed register, a censored topic, a series that dropped out of custody.

Is a visible gap actually telling the reader something?

Before reaching for an encoding, ask what the empty space means. There are three common situations and only one of them rewards a dramatic gap visual.

| Situation | What the gap represents | Best treatment |
|---|---|---|
| Lost or destroyed source | True absence with a known cause | Show and annotate the gap |
| Sampling boundary | Your collection stops, the world did not | Coverage note, trimmed axis |
| Not-yet-processed | Pipeline status, not history | Status field, not a history chart |

If your gap is the first row, visualise it. If it is the second or third, a chart that foregrounds the hole misleads readers into treating a project limitation as a historical finding.

How do I show "no data" differently from "a value of zero"?

This is the single most common failure. A bar of height zero and a missing bar look almost identical, yet they mean opposite things — "we counted, and there were none" versus "we never counted". Give them separate encodings.

python
import pandas as pd
import numpy as np

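# df is assumed to hold one row per recorded year, with columns "year" and "n"
# expected: one row per year 1850-1900
years = pd.RangeIndex(1850, 1901, name="year")
counts = df.set_index("year").reindex(years)  # missing years -> NaN

# label each year so "never counted" and "counted, none found" stay separate
counts["state"] = np.where(
    counts["n"].isna(), "no_record",
    np.where(counts["n"] == 0, "true_zero", "observed"),
)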

In Vega-Lite or ggplot, map no_record to a hatched or grey swatch outside the colour ramp (hatching may need a pattern extension; plain grey always works), keep true_zero on the scale, and never let a NaN render as blank canvas that the eye reads as zero.
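The same separation can be sketched in Matplotlib, used here only because the other examples on this page use it. The toy data and the colour and hatch choices are assumptions for illustration, not a recommended palette.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove for interactive sessions
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical toy data: 1853-1854 are absent from the record entirely
df = pd.DataFrame({"year": [1850, 1851, 1852, 1855, 1856],
                   "n":    [4,    0,    7,    3,    0]})
years = pd.RangeIndex(1850, 1857, name="year")
counts = df.set_index("year").reindex(years)  # missing years -> NaN

no_record = counts.index[counts["n"].isna()]
true_zero = counts.index[counts["n"] == 0]
observed = counts["n"].dropna()

fig, ax = plt.subplots()
ax.bar(observed.index, observed.values, color="tab:blue", label="observed count")

# True zeros stay on the value scale, with a marker so they read as "counted"
ax.scatter(true_zero, np.zeros(len(true_zero)), color="tab:blue",
           zorder=3, label="counted, none found")

# Missing years get a hatched grey band that spans the whole axis height,
# so it cannot be mistaken for a value on the scale
hatched = [
    ax.axvspan(y - 0.4, y + 0.4, facecolor="0.9", edgecolor="0.5",
               hatch="///", zorder=0,
               label="no record" if i == 0 else "_nolegend_")
    for i, y in enumerate(no_record)
]
ax.set_xlabel("year")
ax.set_ylabel("n")
ax.legend()
```

Because the hatched band fills the full height of the axes, it reads as a property of the year rather than as a quantity. Save the result with `fig.savefig(...)` or switch to an interactive backend to inspect it.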

Which chart types handle missing periods best?

For time series, break the line — do not let your plotting library interpolate across a missing span. In Matplotlib, insert np.nan rows so the line is severed automatically:

python
import matplotlib.pyplot as plt
s = counts["n"]            # NaN at gap years
plt.plot(s.index, s.values)   # nan breaks the line, no fake slope
plt.axvspan(1861, 1865, color="0.85", zorder=0)  # shade the gap
plt.annotate("courthouse fire, 1861", xy=(1861, 0))

A coverage strip beneath a timeline works even better for non-specialist readers: a thin horizontal band, green where sources survive, hatched where they do not. Calendar heatmaps (one cell per month or year) make multi-decade gaps legible at a glance, provided NA gets its own swatch. Avoid smooth/spline interpolation, stacked areas across gaps, and anything that manufactures values you never observed.
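A coverage strip of the kind described above might be drawn as follows. This is a minimal sketch: the series values are made up, and the green and hatched styling is just one way to realise the description.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove for interactive sessions
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical series: NaN marks years with no surviving source
years = np.arange(1855, 1871)
n = pd.Series(np.linspace(20, 35, len(years)), index=years)  # made-up values
n.loc[1861:1865] = np.nan

survives = n.notna().to_numpy()

fig, (ax_main, ax_strip) = plt.subplots(
    2, 1, sharex=True, figsize=(7, 3),
    gridspec_kw={"height_ratios": [6, 1]},
)
ax_main.plot(n.index, n.values)  # NaN breaks the line at the gap
ax_main.set_ylabel("n")

# One unit-wide cell per year: solid green where a source survives,
# hatched where it does not
ax_strip.bar(years[survives], 1, width=1.0, color="tab:green")
ax_strip.bar(years[~survives], 1, width=1.0, color="white",
             edgecolor="0.4", hatch="///")
ax_strip.set_yticks([])
ax_strip.set_ylabel("sources", rotation=0, ha="right", va="center")
ax_strip.set_xlabel("year")
```

Sharing the x-axis keeps the strip aligned with the main chart, so a reader can look straight down from a break in the line to the hatched cells that explain it.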

How do I avoid implying a trend across a gap?

Interpolation is a quiet lie. A connected line across a ten-year hole tells the eye there was a steady change when you have no evidence of the path. Three rules:

  1. Break, do not bridge. Sever the line at the gap.
  2. Shade and label. A grey band plus "no surviving returns, 1916-1919" beats silence.
  3. If you must connect, dash and caption. A clearly labelled dashed segment ("no data — schematic only") is acceptable; a solid line is not.
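The three rules can be sketched together in Matplotlib. The helper `gap_spans` is a hypothetical name introduced for this example, the data is invented, and the dashed connector is drawn only to illustrate rule 3; leave it out whenever you can.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove for interactive sessions
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

def gap_spans(s):
    """Return (first_missing, last_missing) index labels for each run of NaN."""
    missing = s.isna()
    run_id = (missing != missing.shift()).cumsum()  # new id when state flips
    return [(grp.index[0], grp.index[-1])
            for _, grp in s[missing].groupby(run_id[missing])]

# Hypothetical yearly returns with a four-year hole
years = np.arange(1910, 1926)
s = pd.Series(np.linspace(50, 80, len(years)), index=years)
s.loc[1916:1919] = np.nan
spans = gap_spans(s)

fig, ax = plt.subplots()
ax.plot(s.index, s.values, color="tab:blue")  # rule 1: NaN severs the line
for start, end in spans:
    # rule 2: shade the missing span and say what it is
    ax.axvspan(start - 0.5, end + 0.5, color="0.85", zorder=0)
    ax.annotate(f"no surviving returns, {start}-{end}",
                xy=((start + end) / 2, s.min()), ha="center")
    # rule 3: if a connector is unavoidable, make it dashed and captioned
    before, after = s.loc[:start].dropna(), s.loc[end:].dropna()
    if len(before) and len(after):
        ax.plot([before.index[-1], after.index[0]],
                [before.iloc[-1], after.iloc[0]],
                linestyle="--", color="0.4", label="no data: schematic only")
ax.legend()
```

Deriving the spans from the NaN runs, rather than typing the years by hand, keeps the shading in step with the data if the series is later extended or corrected.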

Can I quantify the gap before I draw it?

Yes — and a measured gap is far stronger than an asserted one. Compute a coverage ratio per bin: observed units over expected units, where "expected" comes from a finding aid, accession register, or external series.

python
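# df here is assumed to hold one row per surviving issue ("year", "issue")
expected = pd.Series(12, index=years)       # 12 monthly issues per year
observed = df.groupby("year")["issue"].nunique()
# years with no surviving issues form no group, so they arrive as NaN -> 0
coverage = (observed / expected).reindex(years).fillna(0)
# plot coverage as its own layer; a low ratio (say, under 0.5) marks a span worth flagging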

Plot completeness as a separate panel under your main chart. Now the gap is a number a reader can audit, not a rhetorical flourish.
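A worked example with toy numbers may make the ratio and the two-panel layout concrete. The holdings below are invented, and the figure of twelve expected issues per year is an assumption standing in for whatever your finding aid states.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove for interactive sessions
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical holdings: one row per surviving monthly issue
df = pd.DataFrame({
    "year":  [1850] * 12 + [1851] * 3 + [1853] * 12,
    "issue": list(range(1, 13)) + [1, 2, 3] + list(range(1, 13)),
})
years = pd.RangeIndex(1850, 1854, name="year")

expected = pd.Series(12, index=years)  # assumed: 12 issues per year
observed = (df.groupby("year")["issue"].nunique()
              .reindex(years, fill_value=0))  # years with nothing held -> 0
coverage = observed / expected

fig, (ax_main, ax_cov) = plt.subplots(
    2, 1, sharex=True, gridspec_kw={"height_ratios": [3, 1]})
x = years.to_numpy()
ax_main.bar(x, observed.to_numpy(), color="tab:blue")
ax_main.set_ylabel("issues held")

# Completeness gets its own panel, on a fixed 0-1 scale
ax_cov.bar(x, coverage.to_numpy(), color="0.4")
ax_cov.set_ylim(0, 1)
ax_cov.set_ylabel("coverage")
ax_cov.set_xlabel("year")
```

With these numbers the coverage is 1.0 for 1850, 0.25 for 1851, 0.0 for 1852 and 1.0 for 1853, so the lower panel shows both the partial year and the empty one as auditable values.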

When should I not use this technique?

Skip gap visualisation when the absence is structural rather than archival — a silence rather than a gap. People the census never enumerated, labour the ledgers never priced, voices no institution recorded: these will not show up as missing cells because the system was never built to count them. A chart of "missing" data here implies the record almost captured them, which flattens the politics of who gets documented. Frame silences in prose, cite the scholarship on archival absence, and reserve the chart for what was genuinely measured and lost.

Key Takeaways

  • Visualise a gap only when it is a true absence with meaning, not a pipeline or sampling artefact.
  • Encode "no data" distinctly from "zero" — hatched or grey, outside the value ramp.
  • Break time-series lines at gaps; never interpolate a trend you did not observe.
  • Add a coverage strip or completeness ratio so the gap is measured, not merely asserted.
  • Distinguish a quantifiable gap from a structural silence; the latter needs narrative, not a chart.
  • Always annotate the cause of a gap (fire, censorship, lost custody) next to the empty span.
  • A footnote about coverage often beats a dramatic chart of nothing.

Frequently Asked Questions

Should I always visualise the gaps in my data?

No. Visualise a gap only when you can distinguish a true absence (the event did not happen or was never recorded) from a survival or sampling artefact, and when the gap carries meaning for your argument. Otherwise a footnote describing coverage is more honest than a chart that implies measured nothingness.

How do I show "no data" differently from "a value of zero"?

Use distinct visual encodings: render zero as a normal bar or point on the scale, and render missing data as a hatched, greyed, or explicitly labelled cell — never as an empty space the eye reads as zero. In a heatmap, give NA its own colour outside the value ramp.

What's the difference between a gap and a silence in the archive?

A gap is missing data you can often quantify or bound (a lost register, a year with no surviving issues). A silence is a structural absence — people or events the record-keeping system never documented. Gaps suit quantitative visualisation; silences usually need narrative framing alongside any chart.

Which chart types handle missing periods best?

Broken line charts (with the line interrupted, not interpolated), gap-aware heatmaps and calendar grids, coverage strips beneath a timeline, and step charts. Avoid smooth interpolation and stacked areas across gaps, because they invent values that were never observed.

How do I avoid implying a trend across a gap?

Break the line at the gap rather than connecting across it, shade the missing span, and annotate the cause. If you must show a connection, use a dashed segment clearly labelled "no data" so readers do not read interpolation as evidence.

Can I quantify how complete my source is before visualising?

Yes. Compute a coverage ratio (observed units over expected units) per time bin or category, document expected counts from finding aids or accession registers, and plot completeness as its own layer so the gap is measured, not merely asserted.