When to Track Word Frequencies Over Time

Track word frequencies over time when you have a date-stamped, reasonably balanced corpus and a sharp question about how a specific concept rose or fell. Avoid it when your corpus is skewed across years, when your target words shifted spelling or meaning, or when per-year samples are too small to be stable. The method is cheap and powerful for the right question — and quietly misleading for the wrong one. The decision hinges on whether your archive's shape reflects history or just what survived.

Tracking word frequency over time — diachronic frequency analysis — counts how often a term appears per period, normalised by corpus size, to reveal trends. It is one of the most accessible cultural-analytics methods, but its simplicity hides assumptions that, if unmet, turn the result into an artefact of your data.

When does this method actually fit?

It fits squarely when three conditions hold:

Your texts are reliably date-stamped.
The corpus is reasonably balanced across the period (no decade swamps the rest).
Your target term is stable in form and meaning over the span.

Classic good cases: tracking influenza mentions in newspapers across an epidemic, or liberty in pamphlets across a revolution. The concept is named, datable and consistent.
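The balance condition can be checked before any term is counted. The following is a minimal sketch, assuming you already have a token total for each year; the `decade_shares` helper and the sample figures are hypothetical and only illustrate the idea.

```python
from collections import defaultdict

def decade_shares(tokens_per_year):
    """Return each decade's share of all tokens in the corpus."""
    totals = defaultdict(int)
    for year, n_tokens in tokens_per_year.items():
        totals[year // 10 * 10] += n_tokens  # 1791 -> 1790
    grand_total = sum(totals.values())
    return {decade: n / grand_total for decade, n in sorted(totals.items())}

# Made-up token totals per year, for illustration only.
tokens_per_year = {1788: 40_000, 1789: 60_000, 1791: 450_000,
                   1795: 350_000, 1803: 100_000}

shares = decade_shares(tokens_per_year)
dominant = max(shares, key=shares.get)
print(shares)
print(f"The {dominant}s hold {shares[dominant]:.0%} of the corpus")
```

If one decade holds most of the tokens, as the 1790s do in this invented sample, treat any trend that peaks there with suspicion until you have normalised and inspected the sources.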

When should I avoid it?

Walk away, or do heavy correction first, when:

Warning sign	Why it breaks the method
Corpus skewed to a few years	Trend tracks archive shape, not usage
Spelling drifts over the period	One concept splits across many tokens
Word meaning shifts (gay, awful)	Counting the string conflates senses
Tiny per-year samples	Frequencies bounce randomly
Mixed languages	Cross-language counts are meaningless

If two or more of these apply, a frequency chart will look authoritative while measuring nothing.

Why is relative frequency non-negotiable?

Surviving and digitised text varies enormously by year. Raw counts mostly measure volume of print, not salience. Always normalise:

python

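import pandas as pd

# Expected columns: year, term, count, total_tokens
df = pd.read_csv("tokens_by_year.csv")
# Occurrences per 10,000 tokens, so years of different size are comparable
df["rate_per_10k"] = df["count"] / df["total_tokens"] * 10000
# One line per term, with year on the x-axis (plotting requires matplotlib)
df.pivot(index="year", columns="term", values="rate_per_10k").plot()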

Report the rate per 10,000 tokens, and always plot the per-year total_tokens alongside it: a "trend" across years that each contain only 200 tokens is not a trend.
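One way to keep small denominators visible is to flag them when the rate is computed. This is a sketch in plain Python rather than pandas so that it runs without dependencies; the `MIN_TOKENS` threshold and the sample rows are assumptions, not values from any real corpus.

```python
MIN_TOKENS = 10_000  # hypothetical threshold; pick one that suits your corpus

def rates_with_flags(rows, min_tokens=MIN_TOKENS):
    """rows: iterable of (year, count, total_tokens) for a single term.

    Returns (year, rate per 10,000 tokens, is_reliable) tuples.
    """
    out = []
    for year, count, total in rows:
        rate = count / total * 10_000
        out.append((year, round(rate, 2), total >= min_tokens))
    return out

# Made-up rows: 1851 has only 200 surviving tokens.
rows = [(1850, 12, 120_000), (1851, 3, 200), (1852, 15, 130_000)]
flagged = rates_with_flags(rows)
for year, rate, reliable in flagged:
    print(year, rate, "" if reliable else "<- too few tokens to trust")
```

In this invented example the 1851 rate is far higher than its neighbours, but it rests on three hits in 200 tokens, which is exactly the kind of point to drop or grey out in a chart.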

How do I handle spelling and meaning drift?

Two different problems, two fixes:

Form drift — normalise variants to one lemma before counting: oeconomy -> economy, gaol -> jail. A small mapping table fixes this.
Sense drift — frequency cannot distinguish broadcast (sowing seed) from broadcast (radio). Here you must add a concordance or collocation check, reading the surrounding words to confirm the sense you are counting.

python

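# Map historical spellings to one modern form before counting
variants = {"oeconomy": "economy", "œconomy": "economy", "gaol": "jail"}
tokens = ["the", "oeconomy", "of", "the", "gaol"]  # example input
tokens = [variants.get(t, t) for t in tokens]  # unmapped tokens pass through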

Skipping spelling normalisation is a common way for a genuine trend to be flattened into noise, because the concept's counts are split across several tokens.
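For sense drift, the concordance check described above can be sketched as a small keyword-in-context (KWIC) function. The `kwic` helper, the window size and the sample sentence are illustrative assumptions; established text-analysis libraries offer more complete concordancers.

```python
def kwic(tokens, target, window=4):
    """Return (left context, keyword, right context) for each hit of target."""
    hits = []
    for i, tok in enumerate(tokens):
        if tok == target:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            hits.append((left, tok, right))
    return hits

# Invented sentence containing both senses of "broadcast".
text = ("the farmer will broadcast the seed by hand "
        "while stations broadcast the news by radio")
lines = kwic(text.split(), "broadcast")
for left, kw, right in lines:
    print(f"{left:>25} [{kw}] {right}")
```

Reading the context columns is what separates the sowing sense from the radio sense; the frequency count alone treats both hits as the same thing.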

What does this method cost versus alternatives?

Frequency tracking is the cheapest diachronic method — minutes to run, easy to explain. The trade-off is interpretive shallowness: it tells you that something changed, not how or why.

Need why a term clusters with others? Use topic modelling.
Need how a word's meaning shifted? Use diachronic word vectors.
Need the texture of usage? Read a concordance.

Frequency tracking is the right first pass and a poor last word.

How do I confirm a spike is real?

Three checks, every time:

Look at the per-year token count behind the spike — small denominators lie.
Trace the spike to its top documents and read them.
Perturb the tokenisation; a real trend survives, an artefact often does not.
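The third check can be sketched by counting the same term under two tokenisers and comparing the results. The two regular expressions below are illustrative assumptions, not a recommended standard; substitute whichever tokenisers your pipeline could plausibly use.

```python
import re

# Two hypothetical tokenisers that differ on hyphens and apostrophes.
TOKENISERS = {
    "letters_only": r"[a-z]+",
    "keep_hyphens_apostrophes": r"[a-z]+(?:['-][a-z]+)*",
}

def count_term(text, term, pattern):
    """Return (hits for term, total tokens) under the given token pattern."""
    tokens = re.findall(pattern, text.lower())
    return tokens.count(term), len(tokens)

sample = "Liberty! The liberty-loving citizens spoke of liberty's price."
results = {name: count_term(sample, "liberty", pattern)
           for name, pattern in TOKENISERS.items()}
for name, (hits, total) in results.items():
    print(f"{name}: {hits} hits in {total} tokens")
```

On this invented sentence the count of "liberty" changes with the tokeniser, so a spike that appears under only one of the two settings deserves a closer look before it is reported.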

A spike caused by one newly digitised, over-represented newspaper is an artefact of ingestion, not a fact about the past.
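A rough way to test for this is to ask how much of a spike year's count comes from each source. The sketch below assumes you can produce hit counts per year and per source; the `top_sources` helper and the newspaper names are hypothetical.

```python
from collections import Counter

def top_sources(hits, year, n=3):
    """hits: iterable of (year, source, count).

    Returns the n largest sources for that year as (source, share of hits).
    """
    per_source = Counter()
    for y, source, count in hits:
        if y == year:
            per_source[source] += count
    total = sum(per_source.values())
    return [(src, c / total) for src, c in per_source.most_common(n)]

# Made-up hit counts for one term; source names are placeholders.
hits = [
    (1831, "Herald B", 5),
    (1832, "Gazette A", 90),
    (1832, "Herald B", 6),
    (1832, "Courier C", 4),
]
ranking = top_sources(hits, 1832)
for source, share in ranking:
    print(f"{source}: {share:.0%} of 1832 hits")
```

When a single source accounts for nearly all hits in the spike year, as in this invented data, read that source first and check when it entered the corpus.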

Key Takeaways

Use frequency tracking for sharp questions about a stable, named concept in a balanced, dated corpus.
Avoid it when the corpus is skewed or the target word drifts in form or meaning.
Always normalise to relative frequency and plot per-year token counts beside the trend.
Map spelling variants to one lemma; use concordances to police meaning drift.
It is the cheapest diachronic method but only tells you that something changed.
Validate every spike against its sample size and source documents.

Frequently Asked Questions

When is tracking word frequency over time the right method?

It fits when you have a date-stamped corpus that is reasonably balanced across the period and you want to track the rise, fall or seasonality of specific concepts. It is ideal for short, well-defined questions about named things.

When should I NOT use it?

Avoid it when your corpus is heavily skewed toward certain years, when the words you care about change spelling or meaning over the period, or when sample sizes per year are tiny. In those cases the trend reflects your archive, not history.

Why must I use relative not raw frequency?

Because the amount of surviving text varies hugely by year. Raw counts mostly track how much was printed or digitised, not how salient a word was. Dividing by total tokens per year gives a comparable rate.

How do I handle spelling changes over centuries?

Map historical variants to a single lemma before counting — 'oeconomy' and 'economy', or 'gaol' and 'jail'. Skipping this splits one concept across several tokens and flattens a real trend into noise.

Is Google Ngram Viewer good enough?

It is excellent for quick hypothesis generation across English print, but you cannot inspect its corpus, fix its OCR or control its sampling. For research you publish, build a corpus you can audit and document.

How do I know a spike is real?

Check the per-year token count behind it, trace the spike to specific documents and read them, and test whether it survives a change in tokenisation. A spike driven by one over-represented source is an artefact, not a trend.
