Historical Gazetteers & Place Data

To model places that change over time, separate a place's permanent identity from its dated states: one stable identifier threads through the centuries, while names, boundaries and jurisdictions are recorded as time-bounded facts attached to it. When a place splits or merges, you create new identities linked back to the old one rather than overwriting anything. This is the single idea that lets a gazetteer hold history instead of a misleading present-day snapshot.

Why isn't one name and one point enough?

Consider a market town that was Sorviodunum under the Romans, moved its centre downhill in the thirteenth century, and absorbed a neighbouring parish in 1894. A single row with one name and one coordinate erases all three events. Historical places are not static: they rename, relocate, split, merge and redraw boundaries. Modelling change is not an advanced feature; it is the baseline requirement for the data to be true.

What is the identity-versus-state distinction?

This is the core concept, in plain language:

  • Identity is the unchanging thread: "this is the same place we have been talking about." It gets one stable ID.
  • State is what was true during a period: a name, a boundary, a county, a population. Each state has a start and end.

One identity owns many states. You never edit identity; you add and close states as evidence accumulates.

How do I record dated states in practice?

Give each attribute its own validity span. A tiny worked example as JSON:

json
{
  "place_id": "wilt-0001",
  "names": [
    {"toponym": "Sorviodunum", "start": -50, "end": 410},
    {"toponym": "Searobyrg",   "start": 552, "end": 1220},
    {"toponym": "Salisbury",   "start": 1220, "end": null}
  ],
  "jurisdiction": [
    {"county": "Wiltshire", "start": 1086, "end": null}
  ]
}

The null end means "still current." Notice each name carries its own dates — that is the whole trick.
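As a minimal sketch of how such a record might be queried, the Python below looks up the name in force for a given year. The function name `name_in_year` is hypothetical, and the code assumes spans are half-open (start inclusive, end exclusive), which the JSON itself does not state; under that assumption the shared boundary year 1220 resolves to one name only.

```python
from typing import Optional

# Same record as the JSON above, trimmed to names; None stands in for JSON null.
place = {
    "place_id": "wilt-0001",
    "names": [
        {"toponym": "Sorviodunum", "start": -50, "end": 410},
        {"toponym": "Searobyrg", "start": 552, "end": 1220},
        {"toponym": "Salisbury", "start": 1220, "end": None},
    ],
}


def name_in_year(record: dict, year: int) -> Optional[str]:
    """Return the toponym whose span covers `year`, or None if no span does."""
    for state in record["names"]:
        end = state["end"]
        # Assumption: spans are half-open, [start, end), so 1220 matches only "Salisbury".
        if state["start"] <= year and (end is None or year < end):
            return state["toponym"]
    return None


print(name_in_year(place, 1100))  # Searobyrg
print(name_in_year(place, 1220))  # Salisbury
print(name_in_year(place, 500))   # None: no name is recorded between 410 and 552
```

Returning None for a gap, rather than guessing the nearest name, keeps the lookup honest about what the sources actually cover.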

How do I handle splits and merges?

Use relationships, never deletions. When parish A splits into A2 and B in 1894:

text
A (…–1894)  --splitInto-->  A2 (1894–…)
A (…–1894)  --splitInto-->  B  (1894–…)
A2, B       --derivedFrom--> A

The original identity gets an end date; the successors get start dates and a derivedFrom link. A later query for "1850" finds A; a query for "1900" finds A2 and B. Nothing is lost.
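The split above can be sketched in Python as follows. The `Identity` class and the helper names `active_in` and `successors_of` are hypothetical, and the half-open treatment of the year 1894 (A ends, A2 and B begin) is an assumption rather than something the diagram specifies.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Identity:
    place_id: str
    start: Optional[int]  # None = start date not recorded
    end: Optional[int]    # None = still current
    derived_from: List[str] = field(default_factory=list)


# The 1894 split: the parent keeps its row and gains an end date.
identities = [
    Identity("A", None, 1894),
    Identity("A2", 1894, None, derived_from=["A"]),
    Identity("B", 1894, None, derived_from=["A"]),
]


def active_in(year: int) -> List[str]:
    """IDs of identities that exist in `year` (assumed half-open spans)."""
    return [
        i.place_id
        for i in identities
        if (i.start is None or i.start <= year) and (i.end is None or year < i.end)
    ]


def successors_of(place_id: str) -> List[str]:
    """IDs of identities that carry a derivedFrom link back to `place_id`."""
    return [i.place_id for i in identities if place_id in i.derived_from]


print(active_in(1850))     # ['A']
print(active_in(1900))     # ['A2', 'B']
print(successors_of("A"))  # ['A2', 'B']
```

Because A is closed rather than deleted, the 1850 query still resolves, and the derivedFrom links let a reader walk from either successor back to the parent.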

What date precision should I use?

Record only what you can defend. When a charter places an event "in the reign of Henry III," do not invent 1230; record the range the source supports, here 1216 to 1272.

| Knowledge | Bad | Good |
| --- | --- | --- |
| Exact year | 1220 | 1220 |
| Approximate | 1225 | start 1216, latest 1250 |
| Century only | 1300 | start 1200, end 1299 |

The Extended Date/Time Format (EDTF), now part of ISO 8601-2:2019, gives you standard notation for uncertain dates (written "1220?") and approximate dates (written "1225~") so software can reason about them.
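Even without an EDTF library, the earliest and latest bounds from the table above can be stored and compared directly. The sketch below is one hedged way to do that; the `YearRange` class and its method names are hypothetical, and it deliberately does not parse or emit EDTF strings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class YearRange:
    """A date known only to lie somewhere between two years, inclusive."""
    earliest: int
    latest: int

    def __post_init__(self):
        if self.earliest > self.latest:
            raise ValueError("earliest must not be after latest")

    def could_be(self, year: int) -> bool:
        """True if `year` is consistent with the evidence."""
        return self.earliest <= year <= self.latest

    def definitely_before(self, other: "YearRange") -> bool:
        """True only if every possible year here precedes every year in `other`."""
        return self.latest < other.earliest


# The three "Good" rows from the table.
exact = YearRange(1220, 1220)
approximate = YearRange(1216, 1250)
century_only = YearRange(1200, 1299)

print(approximate.could_be(1230))                # True
print(exact.definitely_before(approximate))      # False: the ranges overlap
print(century_only.definitely_before(YearRange(1894, 1894)))  # True
```

An exact year is simply a range whose bounds coincide, so one representation covers all three levels of knowledge.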

Is this too much for a beginner project?

No, because you scale into it. Begin with dated name rows — that alone fixes the most common error. Add boundary geometry and split/merge relationships only once your sources actually show those changes. The discipline is to leave room to grow: never bake a single static name and coordinate into a place, even on day one.
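A day-one version of "dated name rows" might look like the sketch below, using Python's built-in sqlite3 module. The table and column names (`place`, `place_name`, `start_year`, `end_year`) and the helper `names_in` are assumptions for illustration, as is the half-open reading of the spans.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE place (place_id TEXT PRIMARY KEY);
    CREATE TABLE place_name (
        place_id   TEXT NOT NULL REFERENCES place(place_id),
        toponym    TEXT NOT NULL,
        start_year INTEGER,
        end_year   INTEGER  -- NULL means still current
    );
""")
conn.execute("INSERT INTO place VALUES (?)", ("wilt-0001",))
conn.executemany(
    "INSERT INTO place_name VALUES (?, ?, ?, ?)",
    [
        ("wilt-0001", "Sorviodunum", -50, 410),
        ("wilt-0001", "Searobyrg", 552, 1220),
        ("wilt-0001", "Salisbury", 1220, None),
    ],
)


def names_in(year):
    """Toponyms in force in `year`, assuming half-open [start, end) spans."""
    rows = conn.execute(
        "SELECT toponym FROM place_name "
        "WHERE start_year <= ? AND (end_year IS NULL OR ? < end_year) "
        "ORDER BY start_year",
        (year, year),
    ).fetchall()
    return [row[0] for row in rows]


print(names_in(1000))  # ['Searobyrg']
print(names_in(1300))  # ['Salisbury']
```

Keeping names in their own table means boundary or relationship tables can be added later, keyed on the same `place_id`, without touching the rows already entered.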

Key Takeaways

  • Separate permanent identity from dated states; one ID, many time-bounded facts.
  • Give each attribute — name, boundary, jurisdiction — its own start and end dates.
  • Model splits and merges as relationships between identities, never as overwrites.
  • Express imprecise dates as ranges (or EDTF), not as fabricated single years.
  • Start with dated name rows and add complexity only as your sources demand it.
  • A static one-name, one-point record is a present-day snapshot, not history.

Frequently Asked Questions

Why can't I just store one name and one coordinate per place?

Because historical places rename, move their administrative centre, split, merge and shift boundaries. A single static record collapses all of that history into one misleading snapshot.

What is the difference between a place identity and a place state?

Identity is the continuous thread that says this is the same place across centuries; a state is what was true during one period (a name, a boundary, a parent jurisdiction). One identity holds many dated states.

How do I represent a place that split into two?

Keep the original identity, mark its end date, and create two new identities with start dates and a 'derivedFrom' link back to the parent. Relationships, not deletions, capture splits and merges.

Should every attribute have its own dates?

Yes, where it matters. A name, a boundary and a population figure can each have independent validity spans, so attach start and end years to each attribute rather than to the whole record.

What date precision should I use when I only know a century?

Record the range you can defend, such as start 1200 and end 1299 for the thirteenth century, using earliest and latest bounds rather than a single fabricated year. Date notations such as EDTF let you express this uncertainty explicitly.

Is this overkill for a small project?

Start simple with dated name rows and add boundary and relationship modelling only when your sources show change. The point is to leave room to grow, not to model everything on day one.