Wikidata for Heritage

Reconciling a museum catalogue to Wikidata means matching your makers, places, materials and object types to existing Wikidata items and storing the resulting Q-identifiers, turning free-text fields into links to a shared graph. The practical workhorse is OpenRefine with its Wikidata reconciliation service. This guide runs the full workflow: prepare the data, reconcile column by column, resolve ambiguity, and decide whether to write back.

Step 1 — How do I prepare the catalogue export?

Export to CSV or TSV with one entity per cell in each column you want to link, and with stable internal IDs so you can re-merge later. Before loading into OpenRefine:

  • Split combined fields (for example, "Maker (dates)" into a maker column and a dates column).
  • Trim whitespace and normalise obvious variants.
  • Keep your accession number as the unmovable key for every row.

A clean, atomic export is the difference between a smooth reconciliation and hours of false matches.
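
The preparation steps above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive cleaner: the column names `maker_raw` and `accession_number` and the "Name (dates)" pattern are assumptions about your export, so adjust them to match your own data.

```python
import re

# Assumed pattern: "Name (1834-1896)" or "Name (b. 1834)". The parenthesised
# part must contain a digit, so "Morris & Co. (London)" is left intact.
MAKER_WITH_DATES = re.compile(
    r"^(?P<maker>.*?)\s*\((?P<dates>[^()]*\d[^()]*)\)\s*$"
)


def split_maker(raw: str) -> tuple[str, str]:
    """Return (maker, dates); dates is '' when no dated parenthesis is found."""
    text = " ".join(raw.split())  # trim and collapse internal whitespace
    match = MAKER_WITH_DATES.match(text)
    if match:
        return match.group("maker"), match.group("dates")
    return text, ""


def prepare_row(row: dict[str, str]) -> dict[str, str]:
    """Clean one exported row and add atomic maker/dates columns."""
    cleaned = {key: " ".join(value.split()) for key, value in row.items()}
    # The accession number is the re-merge key: copy it back exactly as exported.
    cleaned["accession_number"] = row["accession_number"]
    cleaned["maker"], cleaned["dates"] = split_maker(row.get("maker_raw", ""))
    return cleaned
```

Rows without a recognisable date pattern keep the full string in the maker column, so nothing is silently discarded.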

Step 2 — Which columns reconcile well?

Reconcile in order of confidence and value:

Column | Wikidata target | Confidence
Maker / artist | human (Q5) | high with dates/ULAN
Place of production | place types | high
Material | material items | high
Technique | technique items | medium
Object type | artefact classes | medium
Free-text description | do not reconcile | n/a

Leave narrative description fields out — they are not entities.

Step 3 — How do I run reconciliation in OpenRefine?

Select the maker column, then Reconcile → Start reconciling → Wikidata. Constrain the type to human so candidates are people, not films of the same name. Then strengthen scoring by adding properties from other columns:

Reconcile → Start reconciling → Wikidata
  Type:  human (Q5)
  Also use:
    P569 (date of birth)   ← from your dates column
    P27  (country of citizenship) ← from a nationality column
    P245 (ULAN ID)         ← if you hold one (near-deterministic)

OpenRefine returns ranked candidates per cell. Auto-accept high scores; the middle band needs human eyes.
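
For readers who want to see what OpenRefine sends behind the scenes, the sketch below builds the same kind of request by hand. It assumes the query shape defined by the Reconciliation Service API (a `queries` JSON object whose entries carry `query`, `type`, `limit` and `properties`). The row keys (`maker`, `birth_year`, `country_qid`, `ulan_id`) are hypothetical column names, and the code only builds the payload; it does not contact any service.

```python
import json


def build_query(name, birth_year=None, country_qid=None, ulan_id=None, limit=5):
    """Build one reconciliation query for a maker, constrained to human (Q5)."""
    properties = []
    if birth_year:
        properties.append({"pid": "P569", "v": str(birth_year)})   # date of birth
    if country_qid:
        properties.append({"pid": "P27", "v": {"id": country_qid}})  # citizenship
    if ulan_id:
        properties.append({"pid": "P245", "v": str(ulan_id)})      # ULAN ID
    return {"query": name, "type": "Q5", "limit": limit, "properties": properties}


def build_batch(rows):
    """Return (JSON string for the `queries` parameter, query key -> accession number)."""
    queries, key_to_accession = {}, {}
    for index, row in enumerate(rows):
        key = f"q{index}"
        queries[key] = build_query(
            row["maker"],
            row.get("birth_year"),
            row.get("country_qid"),
            row.get("ulan_id"),
        )
        key_to_accession[key] = row["accession_number"]
    return json.dumps(queries), key_to_accession
```

Keeping the key-to-accession mapping alongside the payload means each returned candidate list can be tied back to the catalogue row that produced it.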

Step 4 — How do I resolve ambiguous matches?

The medium-confidence band is where quality is won or lost. For each uncertain cell, open the candidate in a new tab and check birth year, occupation and known works against your record. Decision rule:

  • Strong corroboration (two independent features agree) → accept.
  • Conflict or thin evidence → leave unmatched.

An unmatched cell is honest and fixable later; a confident wrong match silently corrupts every downstream query. Use facets to isolate "none matched" rows and work through them systematically.
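
The decision rule can be written down as a small function, which is useful if several people review matches and need to apply it consistently. This is a sketch under assumptions: the feature names (`birth_year`, `occupation`, `known_work`) and the flat dictionary shape are hypothetical, and exact equality stands in for whatever comparison your data needs.

```python
def decide(record, candidate, features=("birth_year", "occupation", "known_work")):
    """Return 'accept' or 'unmatched' for one catalogue record and one candidate."""
    agreements = 0
    for feature in features:
        ours, theirs = record.get(feature), candidate.get(feature)
        if ours is None or theirs is None:
            continue  # missing evidence neither supports nor contradicts
        if ours != theirs:
            return "unmatched"  # any conflict blocks the match
        agreements += 1
    # Two independent agreeing features count as strong corroboration.
    return "accept" if agreements >= 2 else "unmatched"
```

A single agreeing feature is treated as thin evidence, so the cell stays unmatched and can be revisited later.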

Step 5 — Should I write back to Wikidata?

Reconciliation can be read-only: pull the QIDs into your collections system for enrichment and stop there. Write back only when:

  • The object or maker is notable and missing from Wikidata, and
  • You can supply sourced statements, and
  • You will maintain the contribution.

If you do write back, use OpenRefine's Wikidata schema to map columns to properties, preview the resulting edits (or export them as QuickStatements), and run a small test batch first.
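
If you prepare statements outside OpenRefine, the same discipline (sourced statements, small first batch) can be enforced in code. The sketch below assumes the tab-separated QuickStatements V1 syntax, with `S854` as the "reference URL" source property; the function names are illustrative, and values must already be formatted the way QuickStatements expects (for example a bare QID for an item, or a quoted string).

```python
def statement_line(qid, prop, value, source_url):
    """Build one sourced statement line; refuse to emit an unsourced one."""
    if not source_url:
        raise ValueError(f"{qid} {prop}: no source URL, statement not written")
    # S854 = reference URL; string values are wrapped in double quotes.
    return "\t".join([qid, prop, value, "S854", f'"{source_url}"'])


def first_batch(lines, size=10):
    """Return a small slice to upload and inspect before running the rest."""
    return lines[:size]
```

Uploading only the output of `first_batch` and checking the resulting edits by hand catches mapping mistakes before they are repeated across the whole catalogue.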

Step 6 — How do I store and maintain the results?

Write the QID into a dedicated field in your CMS, keyed by accession number, so the link is reproducible. Because items are merged and occasionally deleted, schedule a quarterly resolver that:

for each stored QID:
    fetch https://www.wikidata.org/wiki/Special:EntityData/QID.json
    if redirect  -> update to the target QID
    if 404       -> flag for human review

This keeps your catalogue's links alive with minimal effort.
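
A runnable version of the resolver might look like the following, using only the Python standard library. It rests on one assumption worth verifying against a known merged item: that a request for a redirected QID ends up returning JSON whose `entities` object is keyed by the target QID. The user-agent string is a placeholder; replace it with one that identifies your institution.

```python
import json
import urllib.error
import urllib.request

ENTITY_URL = "https://www.wikidata.org/wiki/Special:EntityData/{qid}.json"


def classify(stored_qid, status, payload):
    """Return (outcome, qid) where outcome is 'ok', 'redirect' or 'review'."""
    if status == 404:
        return ("review", None)  # item missing: flag for human review
    returned = list((payload or {}).get("entities", {}))
    if returned == [stored_qid]:
        return ("ok", stored_qid)
    if len(returned) == 1:
        return ("redirect", returned[0])  # assumed: payload is keyed by target QID
    return ("review", None)  # anything unexpected also goes to a human


def check_qid(stored_qid, user_agent="catalogue-qid-check/0.1 (placeholder contact)"):
    """Fetch one entity and classify the result; performs a network request."""
    request = urllib.request.Request(
        ENTITY_URL.format(qid=stored_qid), headers={"User-Agent": user_agent}
    )
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            return classify(stored_qid, response.status, json.load(response))
    except urllib.error.HTTPError as error:
        return classify(stored_qid, error.code, None)
```

Separating `classify` from the network call keeps the decision logic testable offline, and any non-404 error falls through to "review" rather than being treated as a valid link.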

Key Takeaways

  • Reconciliation links catalogue entities to Wikidata QIDs; OpenRefine is the standard tool.
  • Prepare atomic data keyed by accession number before loading.
  • Reconcile high-confidence columns first (maker, place, material); skip free text.
  • Score on multiple features (dates, nationality, ULAN) to beat namesakes.
  • Leave ambiguous cells unmatched rather than guess.
  • Writing back is optional — read-only enrichment is a valid endpoint.
  • Store QIDs and re-check them periodically to survive merges and deletions.

Frequently Asked Questions

What does reconciling a museum catalogue to Wikidata mean?

It means matching the entities in your catalogue — makers, materials, places, object types — to existing Wikidata items and recording the Q-identifiers. The result is a catalogue whose values are linked to a shared global graph rather than free text.

What tool should I use to reconcile a catalogue?

OpenRefine with its built-in Wikidata reconciliation service is the standard choice. It matches columns against Wikidata, lets you review candidates, and can write accepted matches back to Wikidata via QuickStatements or the Wikidata extension.

Which columns should I reconcile first?

Start with controlled, high-value columns: maker/artist, place of production, and material or technique. These have stable Wikidata coverage and high disambiguation confidence. Leave free-text description fields out of reconciliation.

How do I avoid wrong matches for common names?

Add supporting columns to the reconciliation — birth year, nationality, or an existing identifier like ULAN — so OpenRefine scores on more than the label. Hand-review medium-confidence candidates and leave the genuinely ambiguous ones unmatched.

Do I have to publish my catalogue on Wikidata to reconcile it?

No. Reconciliation can be read-only: you pull Q-identifiers into your local records for enrichment and interoperability without contributing anything back. Writing to Wikidata is a separate, optional step.

How do I keep the QIDs current after reconciliation?

Store the QID in your collections system and schedule a periodic check that resolves each QID, flagging redirects (from merges) and any that 404. Wikidata items get merged and occasionally deleted, so the link needs light ongoing maintenance.