Analysing temporal historical networks well means treating time as a first-class variable rather than flattening everything into one static graph. The core best practice is to slice your data into documented time windows (snapshots), compute metrics per window, and track how structure changes — all while reporting the source-survival context that shapes every number. Below is a working checklist that keeps results consistent and defensible across a whole collection.
Snapshots or continuous time: which model fits historical sources?
Most historical evidence does not support a true continuous-time event stream. A letter dated "summer 1623" or a charter witnessed "in the reign of..." cannot anchor a precise timestamp. The pragmatic default is discrete snapshots: split the period into fixed windows and build one graph per window.
Use continuous-time event models only when you genuinely trust day-level dates — modern diplomatic correspondence, telegraph logs, or notarial registers with exact entry dates. Even then, validate against snapshots first.
How wide should each time window be?
Window width trades resolution against statistical noise. Too narrow and each snapshot has a handful of edges and meaningless centrality scores; too wide and you erase the change you wanted to see. A practical rule: aim for roughly 20-50 edges per window.
| Source type | Typical span | Suggested window |
|---|---|---|
| Early-modern letters | 80-150 yrs | 5 years |
| Diplomatic dispatches | 10-30 yrs | 1 year |
| Census-linked kin ties | centuries | 1 census wave |
| Parish witness ties | decades | decade |
Always state the window width in your methods and test sensitivity by re-running at one step finer and one coarser.
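The sensitivity check can be sketched in a few lines. This is a minimal illustration using only the standard library: the toy edge list, the `edges_per_window` helper, and the widths 2, 5, and 10 are assumptions chosen for the example, not values from any real collection.

```python
# Toy edge list of (source, target, year) tuples; all values are illustrative.
edges = [
    ("a", "b", 1600), ("b", "c", 1601), ("c", "d", 1603), ("a", "c", 1606),
    ("d", "e", 1607), ("e", "a", 1611), ("b", "d", 1614), ("c", "e", 1618),
]

def edges_per_window(edge_list, start, end, width):
    """Count edges in each half-open window [w, w + width) between start and end."""
    counts = {w: 0 for w in range(start, end, width)}
    for _source, _target, year in edge_list:
        if start <= year < end:
            # Snap the year down to the start of the window that contains it.
            w = start + ((year - start) // width) * width
            counts[w] += 1
    return counts

# The chosen width (5) plus one step finer (2) and one step coarser (10).
sensitivity = {width: edges_per_window(edges, 1600, 1620, width) for width in (2, 5, 10)}
for width, counts in sensitivity.items():
    print(width, counts)
```

If the finer setting leaves many windows empty or nearly empty, that is a sign the chosen width is already close to the limit of what the sources can resolve.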
A reproducible snapshot loop in Python
NetworkX has no native temporal type, so build snapshots explicitly. Keep one tidy edge list with a year column and slice it:
```python
import pandas as pd
import networkx as nx

# Tidy edge list, one row per edge; columns: source, target, year
edges = pd.read_csv("letters.csv")

width = 5  # window width in years
windows = range(1600, 1700, width)
snapshots = {}
for start in windows:
    # Half-open window [start, start + width) so no edge lands in two snapshots
    sub = edges[(edges["year"] >= start) & (edges["year"] < start + width)]
    G = nx.from_pandas_edgelist(sub, "source", "target")
    snapshots[start] = G
    print(start, G.number_of_nodes(), G.number_of_edges())
```

Storing each snapshot in a dict lets you compute any metric per window and assemble a clean time series.
How do I track centrality over time without distortion?
Compute centrality inside each snapshot, then follow a node's rank across windows. Collapsing all years into one graph destroys the dynamics. A node may dominate one decade and vanish the next — that trajectory is the finding.
```python
from collections import defaultdict

# traj[node][window_start] = betweenness score of that node in that window
traj = defaultdict(dict)
for start, G in snapshots.items():
    bc = nx.betweenness_centrality(G)
    for node, score in bc.items():
        traj[node][start] = round(score, 4)
```

Plot the resulting trajectories. Watch for nodes that only "appear" central because a window happens to be sparse, a known artefact of thin survival.
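To follow rank rather than raw score, one option is to convert each window's scores into ranks before assembling the trajectory. The sketch below assumes per-window score dicts shaped like the ones `nx.betweenness_centrality` returns. The scores themselves, the `dense_rank` helper, and the choice to let tied nodes share a rank are illustrative assumptions.

```python
# Hypothetical per-window scores, shaped like the dicts betweenness_centrality returns.
scores_by_window = {
    1600: {"a": 0.0, "b": 0.6667, "c": 0.6667, "d": 0.0},
    1605: {"a": 1.0, "b": 0.0, "c": 0.0, "d": 0.0},
}

def dense_rank(scores, ndigits=4):
    """Rank nodes from 1 (highest score); nodes with equal rounded scores share a rank."""
    rounded = {node: round(value, ndigits) for node, value in scores.items()}
    distinct = sorted(set(rounded.values()), reverse=True)
    position = {value: i + 1 for i, value in enumerate(distinct)}
    return {node: position[value] for node, value in rounded.items()}

# rank_traj[node][window_start] = rank of that node within that window
rank_traj = {}
for start, scores in scores_by_window.items():
    for node, rank in dense_rank(scores).items():
        rank_traj.setdefault(node, {})[start] = rank

for node in sorted(rank_traj):
    print(node, rank_traj[node])
```

Ranks are only meaningful within a window, so a rank of 1 in a snapshot with four nodes should be read alongside that snapshot's node and edge counts.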
How do I keep survival bias from faking a trend?
This is the deepest trap in temporal historical networks. A rising edge count across windows often reflects better-surviving records, not more historical activity. Before reading any upward curve as real, ask whether more sources simply survived from the later period.
Best practice: maintain a per-window denominator — the number of surviving items that could have produced an edge — and report metrics relative to it. Never publish a raw temporal trend without that context.
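As a rough sketch of what reporting against a denominator can look like, the example below divides each window's edge count by the number of surviving items. All counts are invented for illustration, and the `edges_per_item` helper is hypothetical. How you define a "surviving item" (a letter, a charter, a register entry) depends on your sources.

```python
# Illustrative counts only: edges observed and surviving items per window.
edge_counts = {1600: 12, 1605: 30, 1610: 45}
surviving_items = {1600: 20, 1605: 60, 1610: 90}

def edges_per_item(counts, denominators):
    """Return edges per surviving item for each window, or None where nothing survives."""
    rates = {}
    for window, n_edges in counts.items():
        n_items = denominators.get(window, 0)
        rates[window] = round(n_edges / n_items, 3) if n_items else None
    return rates

rates = edges_per_item(edge_counts, surviving_items)
for window in sorted(rates):
    print(window, edge_counts[window], surviving_items[window], rates[window])
```

In this toy case the raw edge count almost quadruples while the rate per surviving item stays flat or dips, which is the pattern to expect when the apparent growth is driven by survival rather than activity.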
Handling uncertain dates defensibly
When an edge's date is a range, do not silently pick the midpoint. Run the whole analysis twice — once placing each edge at its earliest plausible date, once at its latest. If a structural claim holds under both extremes, it is robust; if it flips, you have learned the limit of your evidence and should say so.
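The two-run check can be sketched as follows. This is a simplified illustration under stated assumptions: the edges and their date ranges are invented, the `top_node_per_window` helper is hypothetical, and "which node has the highest degree in each window" stands in for whatever structural claim you actually want to test.

```python
from collections import Counter

# Hypothetical edges with date ranges: (source, target, earliest_year, latest_year)
dated_edges = [
    ("a", "b", 1600, 1600),
    ("a", "c", 1602, 1606),  # uncertain: could fall in either window
    ("a", "d", 1604, 1604),
    ("b", "c", 1607, 1607),
    ("b", "d", 1608, 1608),
]

def top_node_per_window(edge_list, use_latest, start=1600, end=1610, width=5):
    """Place each edge at one extreme of its range; return the top-degree node(s) per window."""
    result = {}
    for w in range(start, end, width):
        degree = Counter()
        for source, target, earliest, latest in edge_list:
            year = latest if use_latest else earliest
            if w <= year < w + width:
                degree[source] += 1
                degree[target] += 1
        top = max(degree.values(), default=0)
        result[w] = sorted(node for node, d in degree.items() if d == top)
    return result

earliest_run = top_node_per_window(dated_edges, use_latest=False)
latest_run = top_node_per_window(dated_edges, use_latest=True)

# A claim counts as robust for a window only if both extremes agree.
robust = {w: earliest_run[w] == latest_run[w] for w in earliest_run}
print(earliest_run, latest_run, robust)
```

Here the claim about the first window survives both runs, while the second window's leading node depends on where the one uncertain edge is placed, so only the first claim should be reported without qualification.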
Key Takeaways
- Default to documented discrete snapshots; reserve continuous-time models for genuinely day-precise sources.
- Size windows for roughly 20-50 edges and always run a sensitivity check at finer and coarser steps.
- Build snapshots with an explicit, reproducible slice loop rather than hidden manual filtering.
- Compute centrality per window and study trajectories, never one flattened graph.
- Report a per-window survival denominator so trends are not artefacts of differential survival.
- Test uncertain-date edges under both earliest-date and latest-date assignments to check that a claim is robust.
Frequently Asked Questions
Should I use snapshots or a continuous-time model for historical data?
Snapshots (fixed time windows) suit most historical sources because dating is rarely precise enough for continuous-time event streams. Reserve continuous-time models for well-dated correspondence where you trust day-level precision.
How wide should each time window be?
Choose a window that holds roughly 20-50 edges so each snapshot is statistically meaningful but still resolves change. For early-modern letters, 5-year windows are a common defensible default.
How do I stop survival bias from distorting a time trend?
Document the archive's completeness per period and never read a rising edge count as a real rise in activity unless you can rule out differential survival. Report counts of surviving sources alongside every temporal metric.
Which tools handle temporal networks well?
Python with NetworkX plus a custom snapshot loop, the dedicated tnetwork or DyNetx libraries, and Gephi's Timeline filter for visual exploration are a common combination in historical work.
Can I compute centrality on a temporal network?
Yes, but compute it per snapshot and track how a node's rank changes over time rather than collapsing everything into one static graph, which hides exactly the dynamics you set out to study.
How do I handle edges with uncertain dates?
Assign a date range, then run the analysis twice: once with every edge at its earliest plausible date and once at its latest. If your conclusion survives both, it is robust to the dating uncertainty; if it flips, report that limit.