Best Practices for Reconstructing Kinship Networks

Reconstructing a kinship network well rests on three disciplined habits: link people to stable identities before drawing any tie; model relationships as typed edges (parent, spouse, sibling, not a generic "related to"); and record the confidence and evidence behind every inferred link. Applied consistently across a collection, these habits yield a genealogy you can both analyse as a network and defend source by source. Here is the working checklist.

Which sources reconstruct kinship best?

No single source gives a full kinship network; you triangulate. The core register types and what each contributes:

Source	Relationships it yields
Baptism registers	parent-child, godparent ties
Marriage registers	spouse, witness, in-law links
Burial registers	death dates that close lifespans
Wills & testaments	named heirs, inferred siblings
Census returns	co-resident household structure

Cross-referencing several sources both adds relationship types and lets you confirm a link attested in more than one place — a key quality signal.
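As a rough sketch of that quality signal, the snippet below counts how many distinct sources attest each link. The tuple layout, the sample IDs and the function name `sources_per_link` are illustrative assumptions, not a fixed schema.

```python
from collections import defaultdict

# Hypothetical extracted links: (from_id, to_id, relationship, source type)
links = [
    ("P0012", "P0034", "parent", "baptism"),
    ("P0012", "P0034", "parent", "will"),
    ("P0034", "P0041", "spouse", "marriage"),
]

def sources_per_link(links):
    """Map each (from_id, to_id, relationship) to the set of sources attesting it."""
    attested = defaultdict(set)
    for src, dst, rel, source in links:
        attested[(src, dst, rel)].add(source)
    return dict(attested)

attestations = sources_per_link(links)

# Links confirmed by more than one source type
corroborated = {link for link, sources in attestations.items() if len(sources) > 1}
```

Here only the parent link is corroborated, because it appears in both a baptism register and a will.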

How should you model kinship — one network or several?

Model kinship as a typed multigraph: every edge carries its relationship type, and two people may share several edges (a spouse who is also a cousin). Collapsing everything into one undifferentiated tie throws away the very distinctions kinship analysis exists to study.

python

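import networkx as nx

# Directed multigraph: parallel edges are allowed, each tagged with a relationship type
G = nx.MultiDiGraph()
G.add_edge("P0012", "P0034", rel="parent")     # P0012 is parent of P0034
G.add_edge("P0034", "P0041", rel="spouse")     # symmetric tie, stored here in one direction only
G.add_edge("P0034", "P0055", rel="godparent")  # same source-to-target convention as the parent edge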

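Directionality matters for asymmetric relations such as parent-child. For symmetric ones (spouse, sibling), pick one storage convention, either an edge in each direction or a single edge with a fixed ordering of the two IDs, and apply it everywhere so that a query never misses half of a tie.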
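One way to enforce a single convention is to route every edge through a small helper. The sketch below assumes the "edge in each direction" option; the `SYMMETRIC` set and the `add_kin_edge` name are hypothetical, and only standard networkx calls are used.

```python
import networkx as nx

# Assumption: the relationship types this project treats as symmetric
SYMMETRIC = {"spouse", "sibling"}

def add_kin_edge(G, a, b, rel, **attrs):
    """Add a typed edge; symmetric relations are stored once in each direction."""
    G.add_edge(a, b, rel=rel, **attrs)
    if rel in SYMMETRIC:
        G.add_edge(b, a, rel=rel, **attrs)

G = nx.MultiDiGraph()
add_kin_edge(G, "P0012", "P0034", "parent")  # one directed edge
add_kin_edge(G, "P0034", "P0041", "spouse")  # two edges, one per direction
```

Storing both directions costs extra edges but lets successor queries work the same way for every relationship type; remember to halve symmetric edge counts when reporting totals.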

How do you handle the same name appearing repeatedly?

Early-modern naming reuses forenames relentlessly — three generations of "Giovanni" in one family. Blind name-matching is fatal here. Build stable person IDs first, using disambiguating evidence:

birth/marriage/death dates that must be biologically consistent;
place of residence or parish;
parents' names (the strongest disambiguator);
occupation or status markers.

Only after a person has a confident ID should you attach any kinship edge.
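The evidence listed above can be turned into a first-pass filter that rules out impossible matches. This is a minimal sketch under stated assumptions: the `Mention` fields, the two-year tolerance and the function name are all illustrative, and passing the filter means only that two mentions are compatible, not that they are the same person.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Mention:
    """One appearance of a name in a record (fields are illustrative)."""
    name: str
    birth_year: Optional[int] = None
    father: Optional[str] = None

def could_be_same_person(a: Mention, b: Mention, year_tolerance: int = 2) -> bool:
    """Return False as soon as any known field contradicts a match."""
    if a.name != b.name:
        return False
    if a.birth_year is not None and b.birth_year is not None:
        if abs(a.birth_year - b.birth_year) > year_tolerance:
            return False
    # Parents' names are the strongest disambiguator: a known mismatch rules the match out
    if a.father and b.father and a.father != b.father:
        return False
    return True

grandfather = Mention("Giovanni", birth_year=1590, father="Pietro")
grandson = Mention("Giovanni", birth_year=1645, father="Marco")
grandson_again = Mention("Giovanni", birth_year=1646, father="Marco")
```

With these sample mentions, the two later "Giovanni" records remain candidates for one ID, while the grandfather is excluded on both dates and father's name.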

Person versus role: why the distinction matters

A common corruption is duplicating one person because they appear as a father in one record and a witness in another. Keep two layers:

Person — a unique individual with one stable ID.
Role — a position in a specific event (father, godparent, witness).

One person holds many roles across many events without ever becoming two nodes. Store roles as event participations that resolve to person IDs.
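A minimal sketch of the two layers, assuming a simple in-memory layout; the class names, field names and event IDs are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Person:
    person_id: str
    name: str

@dataclass(frozen=True)
class Participation:
    event_id: str    # e.g. one baptism or marriage entry
    person_id: str   # resolves to exactly one Person
    role: str        # position held in this event only

people = {"P0012": Person("P0012", "Giovanni")}
participations = [
    Participation("E001", "P0012", "father"),
    Participation("E002", "P0012", "witness"),
]

def roles_of(person_id, participations):
    """List (event_id, role) pairs for one person across all events."""
    return [(p.event_id, p.role) for p in participations if p.person_id == person_id]
```

The same individual appears as father in one event and witness in another, yet the `people` table still holds a single entry.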

How do you represent uncertain or inferred relationships?

Much kinship is inferred, not stated — siblings deduced from shared parents, in-laws from a marriage. Never let inference masquerade as attestation. Attach to each edge:

python

G.add_edge("P0034", "P0070", rel="sibling",
           confidence="inferred",
           evidence="shared parents P0012/P0013")
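The inference itself can be scripted so that the evidence string is generated rather than typed by hand. The sketch below assumes attested parent links are held in a plain dict; the data and the `infer_full_siblings` name are hypothetical, and it only proposes pairs whose two known parents match exactly.

```python
from itertools import combinations

# Hypothetical attested parent links: child ID -> set of parent IDs
parents = {
    "P0034": {"P0012", "P0013"},
    "P0070": {"P0012", "P0013"},
    "P0091": {"P0012"},
}

def infer_full_siblings(parents):
    """Yield (a, b, evidence) for pairs whose two known parents are identical."""
    for a, b in combinations(sorted(parents), 2):
        if len(parents[a]) == 2 and parents[a] == parents[b]:
            shared = "/".join(sorted(parents[a]))
            yield a, b, f"shared parents {shared}"

inferred = list(infer_full_siblings(parents))
```

P0091 is left out because only one parent is known, so a full-sibling tie cannot be claimed on this evidence alone.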

Keep inferred ties distinguishable in both analysis and visualisation (e.g. dashed lines), so a reader can weigh them. Report the share of inferred versus attested edges in your methods.
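The share to report can be computed straight from the edge attributes. This sketch assumes every edge carries a `confidence` attribute and that directly stated ties are labelled "attested", a label chosen here for illustration.

```python
import networkx as nx

def inferred_share(G):
    """Fraction of edges whose confidence attribute is 'inferred'."""
    total = G.number_of_edges()
    if total == 0:
        return 0.0
    inferred = sum(
        1 for _, _, data in G.edges(data=True)
        if data.get("confidence") == "inferred"
    )
    return inferred / total

G = nx.MultiDiGraph()
G.add_edge("P0012", "P0034", rel="parent", confidence="attested")
G.add_edge("P0012", "P0070", rel="parent", confidence="attested")
G.add_edge("P0034", "P0070", rel="sibling", confidence="inferred",
           evidence="shared parent P0012")

share = inferred_share(G)  # one inferred edge out of three
```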

A reconstruction checklist

text

[ ] one stable ID per person, before any linking
[ ] disambiguation evidence logged for each ID
[ ] edges typed by relationship (parent/spouse/sibling/...)
[ ] person and role kept as separate layers
[ ] inferred ties flagged with confidence + evidence
[ ] biological consistency checked (dates, generations)
[ ] attested-vs-inferred ratio reported

Run the consistency check programmatically — flag any parent younger than their child, or a marriage after a burial — to catch linkage errors before they propagate.
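A minimal sketch of such a check, assuming person records are dicts of known years and parent links are (parent, child) pairs; the field names and sample values are hypothetical and deliberately contain two errors.

```python
# Hypothetical person records: ID -> known years (a missing key means unknown)
people = {
    "P0012": {"birth": 1600, "death": 1660, "marriage": 1625},
    "P0034": {"birth": 1595, "death": 1650, "marriage": 1655},
}
parent_links = [("P0012", "P0034")]  # (parent, child)

def consistency_issues(people, parent_links):
    """Return human-readable flags for biologically impossible data."""
    issues = []
    # A parent must be born before their child
    for parent, child in parent_links:
        p_birth = people[parent].get("birth")
        c_birth = people[child].get("birth")
        if p_birth is not None and c_birth is not None and p_birth >= c_birth:
            issues.append(f"{parent} is not older than child {child}")
    # A marriage cannot follow the same person's death or burial
    for pid, rec in people.items():
        marriage, death = rec.get("marriage"), rec.get("death")
        if marriage is not None and death is not None and marriage > death:
            issues.append(f"{pid} married after death/burial")
    return issues

issues = consistency_issues(people, parent_links)
```

A stricter version might also require a minimum age gap between parent and child; the threshold would be a project decision.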

Key Takeaways

Triangulate parish registers, wills and census returns; no single source is complete.
Model kinship as a typed multigraph, never a generic "related to" tie.
Assign stable person IDs using dates, places and parents' names before linking — names alone mislead.
Separate person from role so one individual is never duplicated across events.
Flag inferred ties with confidence and evidence, and report the inferred-to-attested ratio.
Run biological-consistency checks (generation order, lifespan overlap) to catch linkage errors.

Frequently Asked Questions

What sources are best for reconstructing kinship networks?

Parish registers of baptisms, marriages and burials are the backbone, supplemented by wills, census returns and notarial records. Each supplies a different relationship type, so cross-referencing several sources is essential.

How should I model kinship — as one network or several?

Model it as a typed multigraph where each edge carries its relationship (parent, spouse, sibling), or as a GEDCOM-style genealogy. A single undifferentiated tie loses the distinctions that make kinship analysis meaningful.

How do I handle the same name appearing many times?

Use disambiguating evidence — dates, places, parents' names, occupations — to assign stable person IDs before linking. Naming patterns that reuse forenames across generations make blind name-matching unreliable.

What is the difference between a person and a role in kinship data?

A person is a unique individual; a role (father, godparent, witness) is a position they hold in a specific event. Keep them separate so one person can hold many roles without being duplicated.

How do I represent uncertain or inferred relationships?

Attach a confidence level and the inferring evidence to each edge, and keep inferred ties visually and analytically distinguishable from directly attested ones so readers can judge them.

Can I use GEDCOM for network analysis?

Yes. GEDCOM captures family structure well and can be converted to a graph, but you must translate its family-centred model into person-to-person edges before computing network metrics.
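As an illustration of that translation, the sketch below expands family records into person-to-person edges. It assumes the GEDCOM file has already been parsed into plain dicts keyed by the standard FAM tags (HUSB, WIFE, CHIL); the parsing step, the sample IDs and the function name are not shown or are hypothetical, and labelling every couple as "spouse" is a simplification.

```python
# Hypothetical, already-parsed GEDCOM FAM records
families = [
    {"HUSB": "I1", "WIFE": "I2", "CHIL": ["I3", "I4"]},
]

def family_to_edges(fam):
    """Expand one family record into (from_id, to_id, relationship) tuples."""
    partners = [p for p in (fam.get("HUSB"), fam.get("WIFE")) if p]
    edges = []
    if len(partners) == 2:
        edges.append((partners[0], partners[1], "spouse"))
    for child in fam.get("CHIL", []):
        for parent in partners:
            edges.append((parent, child, "parent"))
    return edges

edges = [edge for fam in families for edge in family_to_edges(fam)]
```

One family record with two partners and two children becomes five typed edges, ready to load into a graph.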
