Link authority files to Wikidata when you want your local records to participate in the global identifier graph: pulling in dates, alternate names and sibling identifiers, and letting other institutions discover that your "John Smith" is their John Smith. Do not link when you cannot confidently disambiguate the entity, when the records are sensitive, or when nobody will maintain the links. The decision turns on disambiguation confidence, whether the data can be made public, and whether the links can be sustained, not on enthusiasm for linked data.
What signals say "yes, link this"?
Linking pays off fastest when:
- Your entities are public figures, creators or organisations likely already in VIAF/GND/ULAN.
- You hold other identifiers (a VIAF or ISNI) that make matching deterministic.
- You want enrichment: alternate spellings, birth/death dates, occupations, images.
- You are publishing collections as linked open data and need a reconciliation target.
What signals say "no, or not yet"?
- Low disambiguation confidence — many namesakes, thin biographical detail.
- Sensitive or living-person data where public linkage raises ethics or GDPR concerns.
- Hyper-local or unique entities with no plausible authority match.
- No maintenance budget — links rot as items merge, split and get vandalised.
How do I weigh the trade-offs?
| Factor | Benefit of linking | Cost / risk |
|---|---|---|
| Discovery | Your records surface via the Wikidata hub | Exposes data you may want private |
| Enrichment | Free dates, names, sibling IDs | Pulled data may be unsourced or wrong |
| Maintenance | Community fixes some errors for you | Merges/vandalism can break your links |
| Disambiguation | Shared external IDs make matches near-certain | Sparse entities risk wrong matches |
| Standards | Becomes a 5-star LOD bridge | Upfront reconciliation effort |
How should I match authority records reliably?
Never match on label alone. A defensible workflow in OpenRefine:
1. Reconcile the name column against Wikidata (type: human, Q5).
2. Add a second feature: floruit/birth year or occupation.
3. Auto-accept matches above your confidence threshold (e.g. 90%).
4. Hand-review the 60–90% band against the source.
5. Leave the rest unmatched — an empty cell beats a wrong link.
6. Export QIDs, then write P214/P227/etc. back with QuickStatements.

A single shared identifier (an ISNI both sides already hold) collapses ambiguity to near-zero.
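The triage in steps 3 to 5, plus the shared-identifier shortcut, can be sketched in Python for rows exported from a reconciliation run. This is a minimal illustration, not part of OpenRefine: the `Candidate` shape, the `triage` function and the two threshold constants are hypothetical names, and the 90 and 60 cut-offs simply mirror the example figures in the steps.

```python
from dataclasses import dataclass
from typing import Optional

AUTO_ACCEPT = 90.0   # assumed threshold, taken from the "e.g. 90%" in step 3
REVIEW_FLOOR = 60.0  # assumed lower edge of the hand-review band in step 4


@dataclass
class Candidate:
    """Best reconciliation candidate for one local record (hypothetical shape)."""
    qid: str
    score: float                # reconciliation score on a 0-100 scale
    isni: Optional[str] = None  # ISNI already recorded on the Wikidata item, if any


def _normalise(identifier: str) -> str:
    """Strip spaces so '0000 0000 ...' and '00000000...' compare equal."""
    return identifier.replace(" ", "")


def triage(candidate: Optional[Candidate], local_isni: Optional[str] = None) -> str:
    """Return 'accept', 'review' or 'unmatched' for one record."""
    if candidate is None:
        return "unmatched"
    # A shared identifier is treated as decisive, whatever the name score says.
    if local_isni and candidate.isni:
        if _normalise(local_isni) == _normalise(candidate.isni):
            return "accept"
    if candidate.score >= AUTO_ACCEPT:
        return "accept"
    if candidate.score >= REVIEW_FLOOR:
        return "review"
    # Below the review band: an empty cell beats a wrong link.
    return "unmatched"
```

Only the rows that come back as `"review"` need a person; rows marked `"unmatched"` stay empty until better evidence turns up.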
Which direction should the link go?
Add the authority ID to the Wikidata item (e.g. P214 for VIAF) so the commons gains a connection, and store the QID in your local record so you can dereference Wikidata for live enrichment. Bidirectional linking means each side can find the other and your CMS can pull updates programmatically.
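A rough Python sketch of the two directions: it stores the QID on a copy of the local record and prepares one statement line per authority identifier for upload. The record layout, the `wikidata_qid` field and both function names are assumptions made for illustration; the output follows the tab-separated item, property, quoted-string pattern of QuickStatements' V1 commands, which is worth checking against the tool's current help page before a bulk run.

```python
import re
from typing import Dict, List, Tuple

# Wikidata properties for the authority schemes named in this article.
PROPERTY_FOR = {
    "viaf": "P214",
    "gnd": "P227",
    "isni": "P213",
    "lccn": "P244",
    "ulan": "P245",
}


def quickstatements_line(qid: str, scheme: str, value: str) -> str:
    """Build one tab-separated statement: item, property, quoted string value."""
    if not re.fullmatch(r"Q[1-9]\d*", qid):
        raise ValueError(f"not a QID: {qid!r}")
    return f'{qid}\t{PROPERTY_FOR[scheme]}\t"{value}"'


def link_both_ways(record: Dict, qid: str) -> Tuple[Dict, List[str]]:
    """Return (record copy with the QID stored, statement lines for Wikidata)."""
    updated = {**record, "wikidata_qid": qid}  # hypothetical local field name
    lines = [
        quickstatements_line(qid, scheme, value)
        for scheme, value in record.get("identifiers", {}).items()
        if scheme in PROPERTY_FOR  # skip local-only identifiers
    ]
    return updated, lines
```

Keeping the local update and the statement generation in one function makes it harder to end up with a link that exists on only one side.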
What is the real maintenance cost?
Links are not fire-and-forget. Items get merged (your QID becomes a redirect — usually fine), occasionally deleted, and sometimes vandalised. Budget a quarterly check that resolves your stored QIDs and flags redirects or 404s. If you cannot commit to that cadence, scope the linking to a stable, high-value subset rather than the whole file.
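The quarterly check could look something like the following Python sketch. It deliberately takes the lookup as an injected function: `resolve` is a hypothetical callable that returns the QID an identifier currently resolves to, or `None` if the item no longer exists. In practice it might wrap a call to the Wikidata API, but that wiring is left out here and `audit_links` is an illustrative name, not an existing tool.

```python
from typing import Callable, Dict, List, Optional, Tuple


def audit_links(
    stored: Dict[str, str],
    resolve: Callable[[str], Optional[str]],
) -> Dict[str, List[Tuple]]:
    """Sort stored links into ok, redirected and missing.

    stored maps a local record ID to the QID saved for it.
    """
    report: Dict[str, List[Tuple]] = {"ok": [], "redirected": [], "missing": []}
    for record_id, qid in stored.items():
        current = resolve(qid)
        if current is None:
            # Item deleted or never existed: needs a human decision.
            report["missing"].append((record_id, qid))
        elif current != qid:
            # Item was merged: the old QID now points at another item.
            report["redirected"].append((record_id, qid, current))
        else:
            report["ok"].append((record_id, qid))
    return report
```

Redirected entries are usually safe to update to the new QID after a quick look; missing entries deserve a manual review before the link is dropped.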
Key Takeaways
- Link when entities are public, well-attested and disambiguable; hold off otherwise.
- A wrong or unmaintained link is worse than no link — confidence and upkeep gate the decision.
- Wikidata complements, never replaces, your local authority file.
- Match on identifiers and dates, not labels; leave uncertain rows unmatched.
- Link both directions: authority ID onto Wikidata, QID into your record.
- Budget ongoing maintenance for merges, deletions and vandalism.
- For sensitive or living-person data, weigh ethics and privacy before exposing links.
Frequently Asked Questions
What does it mean to link an authority file to Wikidata?
It means adding the external authority identifier (such as VIAF, LCNAF or GND) to the matching Wikidata item as a property value, so the item becomes a hub that connects your local record to the global authority graph.
Which authority identifiers does Wikidata support?
Many. The most used in heritage are VIAF (P214), LC Name Authority / LCCN (P244), GND (P227), ISNI (P213), ULAN (P245) and ORCID (P496). Each has its own property and format constraint on Wikidata.
When should I NOT link to Wikidata?
Avoid it when your entities are too obscure or sensitive to be public, when you cannot confidently disambiguate the match, or when a transient project has no capacity to maintain the links. A wrong or unmaintained link is worse than none.
Does linking to Wikidata replace my local authority file?
No. Wikidata is a complement, not a replacement. Keep your authoritative local records; treat the Wikidata link as an interoperability bridge that lets you pull labels, dates and other identifiers, and expose your data to reusers.
How do I link in bulk without creating bad matches?
Reconcile in OpenRefine against Wikidata, match on more than the name (dates, occupation, an existing identifier), review low-confidence rows by hand, and only then write the identifiers back with QuickStatements.
Is it better to add the authority ID to Wikidata or store the QID locally?
Do both where you can. Adding the authority ID to Wikidata enriches the commons; storing the QID in your local record lets you dereference Wikidata for live enrichment. The two directions reinforce each other.