Appearance
To visualise history with ggplot2 well, build each chart from a tidy data frame, map variables to aesthetics deliberately, plot rates rather than raw counts when populations change, make uncertainty visible, and lock a single house theme so every figure in a collection matches. ggplot2's layered grammar lets you do all of this declaratively, which is exactly what makes historical figures consistent and defensible.
How does the grammar of graphics help historians?
ggplot2 builds a plot in layers: data, aesthetic mappings, geoms, scales, and theme. You describe what maps to what, not pixel positions. A minimal time series of burials:
```r
library(ggplot2)

# burials is assumed to be a tidy data frame with one row per year
# and the columns year and count.
ggplot(burials, aes(x = year, y = count)) +
  geom_col(fill = "grey30") +
  labs(title = "Burials, St Mary's parish, 1700-1799",
       x = NULL, y = "Burials per year",
       caption = "Source: Parish register, transcribed 2024")
```

Because each component is explicit, you can swap geom_col for geom_line, or add a smoother, without rebuilding from scratch.
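As a sketch of that swap, the block below declares the shared layers once and then adds different geoms to the same base object. The burials_demo data frame and its values are invented purely for illustration, and the object names are hypothetical.

```r
library(ggplot2)

# Hypothetical toy data, not real parish figures.
burials_demo <- data.frame(
  year  = 1700:1709,
  count = c(12, 15, 11, 30, 14, 13, 16, 12, 18, 15)
)

# Shared layers: data, mappings and labels are declared once.
base <- ggplot(burials_demo, aes(x = year, y = count)) +
  labs(x = NULL, y = "Burials per year")

# Each variant reuses the base and changes only the geom layer.
bars     <- base + geom_col(fill = "grey30")
line     <- base + geom_line(colour = "grey30")
smoothed <- line +
  geom_smooth(method = "loess", formula = y ~ x, se = FALSE, colour = "black")
```

Printing bars, line or smoothed draws the corresponding chart; the data and labels stay identical across all three.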
Should you plot counts or rates?
This is the most consequential decision in historical visualisation. If the underlying population grew, a rising count of, say, criminal convictions may reflect nothing more than a larger population. Normalise:
```r
library(dplyr)
library(ggplot2)

# Assumes columns year, n (convictions) and population (people at risk).
convictions |>
  mutate(rate_per_1000 = (n / population) * 1000) |>
  ggplot(aes(year, rate_per_1000)) +
  geom_line()
```

Use raw counts only for genuinely closed corpora, such as a fixed bundle of correspondence, where there is no population at risk to standardise against.
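To see why the choice matters, here is a small worked example of the arithmetic. The numbers are invented to make the point and do not describe any real jurisdiction; the sketch uses base R so it runs without extra packages.

```r
# Hypothetical figures chosen only to show the arithmetic.
convictions_demo <- data.frame(
  year       = c(1750, 1775, 1800),
  n          = c(50, 65, 80),
  population = c(10000, 15000, 20000)
)

# Convictions per 1,000 inhabitants.
convictions_demo$rate_per_1000 <-
  convictions_demo$n / convictions_demo$population * 1000

round(convictions_demo$rate_per_1000, 2)
# 5.00 4.33 4.00
```

In this toy series the count rises by 60 percent while the rate falls by a fifth, so a chart of raw counts would suggest the opposite trend to a chart of rates.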
How do you make historical date axes behave?
ggplot2 will not place decades sensibly if your dates are text. Convert first, then control the breaks:
```r
library(dplyr)
library(ggplot2)

events |>
  mutate(date = lubridate::ymd(date)) |>  # parse year-month-day text into Date
  ggplot(aes(date, value)) +
  geom_line() +
  scale_x_date(date_breaks = "20 years", date_labels = "%Y")
```

For year-only data, treat the year as numeric and use scale_x_continuous(breaks = seq(1700, 1900, 50)).
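A minimal sketch of the year-only case follows. It assumes the year arrived as text, as it often does after transcription; the series_demo data frame and its values are hypothetical.

```r
library(ggplot2)

# Hypothetical year-only series with the year stored as character.
series_demo <- data.frame(
  year  = c("1700", "1750", "1800", "1850", "1900"),
  value = c(3, 5, 4, 8, 6)
)

# Convert to a numeric type before plotting so the axis is continuous.
series_demo$year <- as.integer(series_demo$year)

year_plot <- ggplot(series_demo, aes(year, value)) +
  geom_line() +
  scale_x_continuous(breaks = seq(1700, 1900, 50))
```

Without the conversion, ggplot2 would treat each year as a discrete category and space them evenly regardless of the gaps between them.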
How do you show uncertainty honestly?
Historical figures are often estimates. Visual weight should reflect confidence:
```r
ggplot(estimates, aes(year, mid)) +
  geom_ribbon(aes(ymin = low, ymax = high), fill = "grey80") +
  geom_line(colour = "grey20")
```

Other tactics: lower alpha on interpolated points, dash reconstructed segments with linetype, or facet attested versus modelled data. The aim is that a reader never mistakes an educated guess for a record.
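The linetype and alpha tactics can be sketched as below. The status column, its two labels and all values are assumptions made for illustration; use whatever flag your own data carries for attested versus reconstructed values.

```r
library(ggplot2)

# Hypothetical series: status marks whether each value is attested in a
# source or reconstructed by the researcher.
flagged <- data.frame(
  year   = seq(1700, 1790, by = 10),
  value  = c(40, 42, 45, 47, 50, 55, 53, 58, 60, 63),
  status = rep(c("attested", "reconstructed"), each = 5)
)

uncertainty_plot <- ggplot(flagged, aes(year, value)) +
  # Solid line for attested values, dashed for reconstructed ones.
  geom_line(aes(linetype = status), colour = "grey20") +
  # Reconstructed points are drawn fainter than attested ones.
  geom_point(aes(alpha = status)) +
  scale_linetype_manual(values = c(attested = "solid", reconstructed = "dashed")) +
  scale_alpha_manual(values = c(attested = 1, reconstructed = 0.4))
```

Because the line is grouped by status, the two segments are drawn separately and a gap appears where they meet. If you want them joined, one option is to repeat the boundary row under both statuses.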
How do you keep colour accessible?
Colour-code by category only with a tested palette, and never let colour carry the meaning alone:
```r
ggplot(df, aes(year, value, colour = region, linetype = region)) +
  geom_line() +
  scale_colour_viridis_d()
```

Viridis is perceptually uniform and colourblind-safe. Adding linetype means the chart survives greyscale printing, which still matters for journal figures.
How do you enforce a house style?
Set the theme once and reuse it everywhere:
```r
theme_set(
  theme_minimal(base_size = 12, base_family = "serif") +
    theme(plot.title = element_text(face = "bold"),
          plot.caption = element_text(colour = "grey40", hjust = 0))
)
```

Wrap recurring labelling in a small helper so captions, source lines and sizing never drift between figures one and forty.
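One possible shape for such a helper is sketched below. The function name house_labs() and its arguments are hypothetical, not part of ggplot2; it simply wraps labs() so every caption starts with the same "Source:" prefix.

```r
library(ggplot2)

# Hypothetical helper: returns a labs() layer with a uniform source caption.
# Extra arguments (x, y, subtitle, ...) are passed straight through to labs().
house_labs <- function(title, source, ...) {
  labs(title = title, caption = paste("Source:", source), ...)
}

# Toy data for demonstration only.
labelled_plot <- ggplot(
  data.frame(year = 1700:1702, count = c(12, 15, 11)),
  aes(year, count)
) +
  geom_col() +
  house_labs("Burials, St Mary's parish",
             "Parish register, transcribed 2024",
             x = NULL, y = "Burials per year")
```

Keeping the caption format inside one function means a later change to the house style is made in one place rather than in every figure script.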
A pre-export checklist worth keeping:
| Check | Why it matters |
|---|---|
| Rate vs count chosen deliberately | Avoids population-growth artefacts |
| Axis types are Date/numeric | Decades and centuries land correctly |
| Source cited in caption | Provenance travels with the figure |
| Palette colourblind-safe | Inclusive and print-robust |
| Uncertainty visible | No estimate disguised as fact |
| Saved with ggsave() at fixed size | Reproducible dimensions and DPI |
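The final row of the checklist might look like this in practice. The file name, dimensions and resolution are placeholders, and the toy plot stands in for a real figure; follow your publisher's specification for the actual values.

```r
library(ggplot2)

# Toy plot standing in for a finished figure.
figure <- ggplot(
  data.frame(year = 1700:1702, count = c(12, 15, 11)),
  aes(year, count)
) +
  geom_col()

# Hypothetical output path; a temporary directory keeps the sketch side-effect free.
out_path <- file.path(tempdir(), "figure-01.png")

# Explicit width, height, units and dpi make the export independent of
# whatever size the plotting window happens to be.
ggsave(out_path, plot = figure, width = 160, height = 100, units = "mm", dpi = 300)
```

Passing the plot explicitly, rather than relying on the last plot displayed, also avoids saving the wrong figure when several are built in one script.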
Key Takeaways
- Build charts from tidy data and explicit aesthetic mappings.
- Plot rates, not raw counts, whenever the population changes over time.
- Convert dates to real Date/numeric types before setting axis breaks.
- Make uncertainty visible with ribbons, alpha or line types.
- Use viridis plus a second non-colour channel for accessibility.
- Lock a house theme with theme_set() for collection-wide consistency.
- Export with ggsave() so figure dimensions and DPI are reproducible.
Frequently Asked Questions
Should I plot counts or rates for historical populations?
Plot rates when the underlying population changes over time, otherwise growth in raw counts just reflects more people. Counts are fine for closed collections like a fixed set of letters, but normalise to a denominator for demographic claims.
How do I show uncertainty in a ggplot2 chart?
Use geom_ribbon() or geom_errorbar() for ranges, lower the alpha on uncertain points, or shade reconstructed periods. Never present an estimated figure with the same visual weight as a directly attested one.
Why do my historical date axes look wrong?
ggplot2 needs a real Date or numeric type, not a character string. Convert with lubridate first, then control breaks with scale_x_date() or scale_x_continuous() so decades and centuries land on sensible ticks.
How do I make charts colourblind-safe?
Use scale_colour_viridis_d() or a tested palette, and never rely on colour alone. Add direct labels, line types or facets so the chart still reads in greyscale or for colourblind viewers.
How do I keep a consistent house style across many figures?
Define a theme once with theme_set() and a small wrapper function, then reuse it for every figure. This keeps fonts, sizing and captions uniform across a whole collection or publication.