Appearance
Minimum viable metadata is the smallest set of fields that still lets every item in a collection be uniquely identified, found, cited and preserved. In practice that means about six to nine mandatory elements — an identifier, a title, a creator or source, a date, a type, a rights statement and a pointer back to the original — captured consistently for every object before you spend effort on anything richer. The discipline is not "less metadata"; it is defensible metadata: a documented floor that every record clears, so nothing is unfindable, unattributable or legally unusable.
What counts as the irreducible core?
Start from function, not from a schema. Each mandatory field must earn its place by answering one of four questions: Which item is this? (identifier), Can a human recognise it? (title), Can it be cited and attributed? (creator, date), Can it be reused? (rights). A workable Dublin Core floor:
text
dc:identifier ARK or local accession no. (mandatory, unique)
dc:title natural-language label (mandatory)
dc:creator person/body or "Unknown" (mandatory, may be Unknown)
dc:date normalised EDTF (mandatory)
dc:type DCMI Type vocabulary (mandatory)
dc:rights rightsstatements.org URI (mandatory)
dc:source provenance / shelfmark (recommended)
dc:language ISO 639-2 code (recommended)"Unknown" is a valid value, but a blank is not — that distinction is what makes a minimal record auditable.
How do I decide which fields are mandatory?
Use an obligation ladder of three levels and force every field onto a rung:
| Obligation | Meaning | Empty allowed? |
|---|---|---|
| Mandatory | Record is rejected without it | No |
| Mandatory-if-applicable | Required when it exists (e.g. dc:language for text) | Conditional |
| Optional | Enriched later, never blocks ingest | Yes |
Keep mandatory to six–ten fields. The trap is "just one more" creep: every field you promote multiplies across the whole collection, so a 12,000-item set turns a one-minute addition into 200 hours of work.
A working checklist before you ingest
- Pick a base schema (Dublin Core unless a funder mandates MODS/MARC).
- Write a one-page application profile listing each field, its obligation, its vocabulary and one good/bad example.
- Choose a single identifier scheme and a date format (EDTF) — no exceptions.
- Define controlled values for
dc:typeanddc:rightsso they are machine-comparable. - Add three filled-in sample records to the profile as a reference.
- Run 20 pilot records and a quick
csvstatpass before scaling.
bash
# sanity-check a CSV export of the pilot
csvstat --null-value "" records.csv | grep -A1 "identifier\|rights\|date"
# every mandatory column should report 0 nullsWhy does rights belong in the minimum?
Because an item with no rights statement is, to most reusers, indistinguishable from "all rights reserved, do not touch." Adding a rightsstatements.org URI or a Creative Commons licence URL costs seconds per item but unlocks aggregation into Europeana, the DPLA and search engines. Leaving it for "phase two" silently strands the whole collection.
How do I keep the profile from bloating later?
Treat the field list as versioned policy. Any change requires a one-line rationale, a version bump and a backfill plan for the records already created — because a field that is mandatory for new items but blank on the old 8,000 is worse than not having it. Review the profile once a year, not once a week.
Minimal now, rich later — without rework
The reason a tight Dublin Core floor pays off is that it crosswalks cleanly. The same six fields map to MODS, MARC, Schema.org and RDF later without re-keying, so "minimal" is a starting point you grow from, not a ceiling you are stuck under.
Key Takeaways
- Minimum viable metadata is roughly six to nine mandatory fields covering identity, findability, citability and rights.
- Define obligation levels explicitly; "Unknown" is allowed, blank is not.
- Rights and a persistent identifier are non-negotiable parts of the floor.
- Capture the floor as a one-page application profile with examples and vocabularies.
- Resist field creep — every promoted field multiplies across the entire collection.
- Choose Dublin Core unless a funder forces otherwise; it crosswalks everywhere.
- Validate a 20-record pilot for null mandatory fields before scaling.
Frequently Asked Questions
What is minimum viable metadata?
It is the smallest set of fields that lets an item be uniquely identified, found, cited and preserved. For most digitised collections that is roughly six to nine fields: identifier, title, creator, date, type, rights and a source note.
How many fields should a minimal record have?
Aim for six to ten mandatory fields. Below six you usually lose findability or citability; above twelve you are no longer minimal and per-item effort rises sharply.
Is Dublin Core enough for minimum viable metadata?
Yes. The 15 Dublin Core elements plus a controlled identifier scheme cover almost every minimal use case, and they crosswalk cleanly to MODS, MARC and Schema.org later.
Should rights be part of the minimum?
Yes. Without a rights statement an item is legally unusable to most reusers, so a rightsstatements.org URI or licence URL belongs in the mandatory core even when everything else is sparse.
How do I stop a minimal profile from growing?
Write the field list as a one-page application profile with obligation levels, and require a documented decision and a backfill plan before any field moves from optional to mandatory.