To create a MODS record for a digitised item, capture eight things at minimum: the title, the responsible name with its role, the resource type, a structured date, the language, the physical description, the URL of the digital object, and record-source notes. Author them in MODS 3.7 XML, validate against the official schema, then ingest. Below is the end-to-end workflow I use for map, photograph and manuscript digitisation, with a complete worked record.
What goes into a complete MODS record?
Start from a checklist rather than a blank file. For a single digitised photograph the skeleton is:
xml
<mods xmlns="http://www.loc.gov/mods/v3" version="3.7"
      xmlns:xlink="http://www.w3.org/1999/xlink">
  <titleInfo>
    <title>High Street, Ipswich, looking west</title>
  </titleInfo>
  <name type="personal">
    <namePart>Cobbold, Felix</namePart>
    <role><roleTerm type="text" authority="marcrelator">Photographer</roleTerm></role>
  </name>
  <typeOfResource>still image</typeOfResource>
  <genre authority="aat">photographs</genre>
  <originInfo>
    <dateCreated encoding="w3cdtf" keyDate="yes">1903</dateCreated>
    <place><placeTerm type="text">Ipswich, Suffolk</placeTerm></place>
  </originInfo>
  <language><languageTerm authority="iso639-2b">eng</languageTerm></language>
  <physicalDescription>
    <form authority="marcform">print</form>
    <extent>1 photograph : gelatin silver ; 12 x 18 cm</extent>
  </physicalDescription>
  <location>
    <url access="object in context">https://digitalrelics.uk/items/ips-0903</url>
  </location>
  <recordInfo>
    <recordContentSource>Aether Forge Archive</recordContentSource>
    <recordCreationDate encoding="w3cdtf">2024-12-03</recordCreationDate>
  </recordInfo>
</mods>
Every element here earns its place: keyDate="yes" tells aggregators which date to sort on; access="object in context" distinguishes a landing page from a raw file.
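The checklist can also be turned into a quick automated pre-check. The sketch below uses only the Python standard library; the function name missing_elements and the REQUIRED_PATHS mapping are my own illustrative choices, and the paths simply mirror the eight items listed above. It reports what is absent and is not a substitute for schema validation.

```python
import xml.etree.ElementTree as ET

MODS_NS = {"m": "http://www.loc.gov/mods/v3"}

# Hypothetical mapping: one path per item in the eight-point checklist.
# Adjust it to your own cataloguing policy.
REQUIRED_PATHS = {
    "title": "m:titleInfo/m:title",
    "name with role": "m:name/m:role/m:roleTerm",
    "resource type": "m:typeOfResource",
    "structured date": "m:originInfo/*[@encoding]",
    "language": "m:language/m:languageTerm",
    "physical description": "m:physicalDescription",
    "object URL": "m:location/m:url",
    "record info": "m:recordInfo",
}


def missing_elements(xml_text):
    """Return the checklist labels whose path matches nothing in the record."""
    root = ET.fromstring(xml_text)
    return [
        label
        for label, path in REQUIRED_PATHS.items()
        if root.find(path, MODS_NS) is None
    ]
```

Calling missing_elements on the text of a record returns an empty list when every checklist item is present, so it can gate a batch before the slower schema run.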
How do you handle names and roles correctly?
The single biggest quality win in MODS is typed names. Give every name a type (personal, corporate, conference or family) and a role/roleTerm drawn from the MARC relator list. Where an authority record exists, add valueURI and authority:
xml
<name type="personal" authority="naf"
      valueURI="http://id.loc.gov/authorities/names/n79021164">
  <!-- Illustrative URI: substitute the one from the person's own authority record -->
  <namePart>Constable, John</namePart>
  <role><roleTerm type="code" authority="marcrelator">art</roleTerm></role>
</name>
Linking to a Library of Congress NAF or VIAF identifier is what later lets you crosswalk to linked data without re-disambiguating people.
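If names are generated by script rather than typed by hand, the same rules can be enforced at build time. This is a minimal sketch using the standard library's ElementTree; make_name and its parameters are hypothetical names of mine, and the set of allowed types reflects the values I expect the name element's type attribute to accept.

```python
import xml.etree.ElementTree as ET

MODS = "http://www.loc.gov/mods/v3"
ET.register_namespace("", MODS)  # serialise MODS as the default namespace

ALLOWED_TYPES = {"personal", "corporate", "conference", "family"}


def make_name(name_part, name_type, relator_code, value_uri=None, authority=None):
    """Build a typed <name> element with a coded MARC relator role."""
    if name_type not in ALLOWED_TYPES:
        raise ValueError(f"unexpected name type: {name_type!r}")
    attrs = {"type": name_type}
    if value_uri:
        attrs["valueURI"] = value_uri
    if authority:
        attrs["authority"] = authority
    name = ET.Element(f"{{{MODS}}}name", attrs)
    ET.SubElement(name, f"{{{MODS}}}namePart").text = name_part
    role = ET.SubElement(name, f"{{{MODS}}}role")
    term = ET.SubElement(
        role, f"{{{MODS}}}roleTerm", {"type": "code", "authority": "marcrelator"}
    )
    term.text = relator_code
    return name
```

For example, ET.tostring(make_name("Cobbold, Felix", "personal", "pht"), encoding="unicode") serialises a photographer entry; an untyped or mistyped name fails immediately instead of reaching the repository.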
How do you record the digital object itself?
A descriptive record that does not point at the file is half a record. Use location/url for access copies and relatedItem type="original" for the analogue source if you describe both. Record the holding repository in physicalLocation and the shelfmark of the original in shelfLocator:
xml
<location>
  <physicalLocation>SRO Ipswich</physicalLocation>
  <shelfLocator>HD2418/4</shelfLocator>
  <url access="raw object" usage="primary display">https://digitalrelics.uk/iiif/ips-0903/full/full/0/default.jpg</url>
</location>
Can you build MODS records in bulk?
Yes, and you should for anything over a few dozen items. Keep cataloguers in a spreadsheet (one row per item) and template the XML. A compact Python approach:
python
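import csv
from pathlib import Path
from jinja2 import Environment, FileSystemLoader
# autoescape so "&" or "<" in a catalogue value cannot break the XML
env = Environment(loader=FileSystemLoader("."), autoescape=True)
tmpl = env.get_template("mods.xml.j2")
out_dir = Path("mods")
out_dir.mkdir(exist_ok=True)  # create the output folder on first run
with open("catalogue.csv", newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):  # one row per item; expects an "id" column
        xml = tmpl.render(**row)
        (out_dir / f"{row['id']}.xml").write_text(xml, encoding="utf-8")
This separates cataloguing (the spreadsheet) from serialisation (the template), so a non-XML person can do the description.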
How do you validate before ingest?
Never ingest unvalidated records. Run the official schema over every file:
bash
xmllint --noout --schema mods-3-7.xsd mods/*.xml
xmllint prints each error with its line number and exits with a non-zero code if any file fails. Common failures: wrong element order (MODS is order-sensitive within some sequences), a missing namespace declaration, or an encoding value that is not a permitted token.
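For large batches it can help to collect failures per file rather than read one long terminal scroll. The wrapper below is a sketch that assumes xmllint is on the PATH and that the schema file sits in the working directory; validate_dir and its run parameter are hypothetical names, the latter being a hook so the function can be exercised without xmllint installed.

```python
import subprocess
from pathlib import Path


def validate_dir(mods_dir, schema="mods-3-7.xsd", run=subprocess.run):
    """Run xmllint on each .xml file; return {filename: stderr} for failures."""
    failures = {}
    for path in sorted(Path(mods_dir).glob("*.xml")):
        result = run(
            ["xmllint", "--noout", "--schema", schema, str(path)],
            capture_output=True,
            text=True,
        )
        # Any non-zero exit code is treated as a failed record.
        if result.returncode != 0:
            failures[path.name] = result.stderr.strip()
    return failures
```

An empty dictionary from validate_dir("mods") means every file passed; otherwise each key names a record to send back to the cataloguer along with xmllint's own message.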
What are the most common mistakes?
- Plain-text dates with no encoding: unsortable and unqueryable.
- Untyped names: you lose the surveyor/engraver/photographer distinction.
- Stuffing everything into note instead of the right element.
- Forgetting keyDate="yes", so aggregators guess your sort date.
- No recordInfo, so provenance of the metadata itself is lost.
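Most of the pitfalls above are schema-valid, so xmllint will not catch them. A small lint pass can. The sketch below checks four of the five (overuse of note needs human judgement); lint_record, the DATE_TAGS selection and the warning strings are illustrative assumptions rather than any standard tool.

```python
import xml.etree.ElementTree as ET

NS = {"m": "http://www.loc.gov/mods/v3"}
# Assumed subset of originInfo date elements worth checking.
DATE_TAGS = ("dateCreated", "dateIssued", "dateCaptured", "dateOther")


def lint_record(xml_text):
    """Return warnings for common quality problems (not schema validation)."""
    root = ET.fromstring(xml_text)
    warnings = []
    dates = [
        d for tag in DATE_TAGS for d in root.findall(f"m:originInfo/m:{tag}", NS)
    ]
    if any(d.get("encoding") is None for d in dates):
        warnings.append("date without encoding")
    if sum(d.get("keyDate") == "yes" for d in dates) != 1:
        warnings.append("expected exactly one keyDate")
    if any(n.get("type") is None for n in root.findall("m:name", NS)):
        warnings.append("untyped name")
    if root.find("m:recordInfo", NS) is None:
        warnings.append("no recordInfo")
    return warnings
```

Run it after schema validation: a record can pass the XSD and still come back with several warnings here.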
Key Takeaways
- Author MODS 3.7 and validate against mods-3-7.xsd before every ingest.
- Minimum viable record = title, typed name+role, resource type, structured date, language, physical description, URL, record info.
- Always type names and link to NAF/VIAF identifiers for future linked-data work.
- Use encoding="w3cdtf" or encoding="edtf" on every date and mark one keyDate="yes".
- Distinguish access copies from landing pages with location/url access attributes.
- For volume work, catalogue in a spreadsheet and template the XML.
- Record metadata provenance in recordInfo, not just object provenance.
Frequently Asked Questions
What are the minimum MODS elements for a digitised item?
A defensible minimum is titleInfo, name with a role, typeOfResource, originInfo (with a structured date), language, physicalDescription, location (the digital object URL) and recordInfo. Anything thinner is hard to disambiguate later.
Which MODS version should I use?
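Use MODS 3.7, published by the Library of Congress in 2018, unless your repository specifies otherwise; a later 3.8 release also exists, so confirm which version your ingest target supports. Declare version="3.7" on the root element and validate against mods-3-7.xsd.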
How do I record the digitised file location in MODS?
Use the location element with a url child. Add access="object in context" or access="raw object" attributes so a viewer can tell a landing page from a direct image link.
Should dates go in a date element or as plain text?
Always use a typed date element such as dateIssued or dateCreated with an encoding attribute (encoding="w3cdtf" or encoding="edtf"). Plain-text dates cannot be sorted or range-queried reliably.
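The schema checks that the encoding attribute holds a permitted token, but as far as I know it does not check that the date string itself follows that encoding. A spreadsheet-side check fills the gap. This sketch covers the W3CDTF profile only (year, year-month, full date, or a full date with time and time-zone designator); is_w3cdtf is a hypothetical helper, and it does not range-check the time fields.

```python
import re
from datetime import date

# W3CDTF granularities: YYYY, YYYY-MM, YYYY-MM-DD, or a full date followed by
# a time (minutes, optional seconds and fraction) and a zone designator.
_W3CDTF = re.compile(
    r"^(\d{4})(?:-(\d{2})(?:-(\d{2})"
    r"(?:T\d{2}:\d{2}(?::\d{2}(?:\.\d+)?)?(?:Z|[+-]\d{2}:\d{2}))?)?)?$"
)


def is_w3cdtf(value):
    """Return True if value looks like a W3CDTF date with a real calendar day."""
    match = _W3CDTF.match(value)
    if not match:
        return False
    year, month, day = match.groups()
    try:
        # Missing month or day defaults to 1; rejects e.g. 30 February.
        date(int(year), int(month or 1), int(day or 1))
    except ValueError:
        return False
    return True
```

Values such as "1903" and "2024-12-03" pass, while "circa 1903" and "03/12/2024" fail and can be routed to a plain note or a differently encoded date instead.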
How do I validate a MODS record?
Run xmllint with the schema, e.g. xmllint --noout --schema mods-3-7.xsd record.xml. An exit code of zero means the record is schema-valid; fix every reported error before ingest.
Can I generate MODS in bulk from a spreadsheet?
Yes. Keep one row per item with columns mapped to MODS paths, then template the XML with a script (Python plus lxml or a Jinja2 template). This keeps cataloguers in a spreadsheet while producing valid XML.