Write a minimal TEI header when the metadata is not yet stable and the priority is getting text encoded: early drafts, teaching files, single-use transcriptions and pilots. Skip the minimal approach the moment a document is bound for publication, a repository or OAI-PMH harvesting, because those systems read header fields and gaps there become real discovery and provenance failures. The header is cheap to expand later, but expensive to fix after a collection has shipped.
What is the absolute minimum?
The TEI schema requires only three children of fileDesc. Here is a valid header with nothing optional:
```xml
<teiHeader>
  <fileDesc>
    <titleStmt>
      <title>Letter from A. Smith to J. Brown, 1843</title>
    </titleStmt>
    <publicationStmt>
      <p>Unpublished working transcription.</p>
    </publicationStmt>
    <sourceDesc>
      <p>Transcribed from MS 1843/17, County Archive.</p>
    </sourceDesc>
  </fileDesc>
</teiHeader>
```

That validates. It is enough to start encoding the same afternoon you open the manuscript.
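For the header to validate it has to sit inside a TEI document in the TEI namespace. The sketch below wraps the same minimal header in the smallest complete file; the empty paragraph and its comment are placeholders, not part of the original example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Minimal complete TEI file: the namespace declaration is required for validation -->
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title>Letter from A. Smith to J. Brown, 1843</title>
      </titleStmt>
      <publicationStmt>
        <p>Unpublished working transcription.</p>
      </publicationStmt>
      <sourceDesc>
        <p>Transcribed from MS 1843/17, County Archive.</p>
      </sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <!-- Placeholder paragraph: replace with the transcription -->
      <p></p>
    </body>
  </text>
</TEI>
```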
When does a minimal header actually fit?
Reach for it when the answer to "will this metadata change soon?" is yes:
- Drafts and spikes: you are testing an encoding model, not publishing.
- Teaching: students should learn body markup without drowning in profileDesc.
- Single-use extraction: you need the text for one analysis, not a permanent edition.
- Rapid pilots: proving a workflow before committing to a metadata application profile.
In all of these, a heavy header is premature optimisation. You would polish provenance fields that the next iteration rewrites anyway.
When is a minimal header the wrong call?
The failure mode is shipping a thin header into a system that depends on it. Watch for these signals:
| Signal | Why a minimal header hurts |
|---|---|
| Going into a repository | DC/MODS crosswalks pull from titleStmt, publicationStmt |
| OAI-PMH harvesting | Aggregators map header fields to Dublin Core; gaps = poor discovery |
| Long-term preservation | sourceDesc and revisionDesc carry provenance auditors expect |
| Multi-encoder team | Without encodingDesc, conventions drift across files |
| Citable scholarly edition | Readers need responsibility, licence and edition statements |
If any of these apply, invest in the header now — retrofitting metadata across hundreds of files later is far costlier.
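As a rough illustration of what "investing in the header" can look like for the repository and citable-edition rows, here is one possible enriched fileDesc. Everything in square brackets, the edition wording, the date and the Creative Commons licence are placeholders chosen for the sketch, not recommendations from this article; the xml:id value is picked only so that a who="#er" pointer in revisionDesc has something to resolve to.

```xml
<fileDesc>
  <titleStmt>
    <title>Letter from A. Smith to J. Brown, 1843</title>
    <!-- Responsibility statement: who did what (placeholder name) -->
    <respStmt>
      <resp>Transcription and encoding</resp>
      <name xml:id="er">[encoder name]</name>
    </respStmt>
  </titleStmt>
  <!-- Edition statement: lets readers cite a specific state of the text -->
  <editionStmt>
    <edition>[edition label, e.g. working edition]</edition>
  </editionStmt>
  <!-- Structured publicationStmt: an agency (publisher) must precede the details -->
  <publicationStmt>
    <publisher>[publishing institution]</publisher>
    <date when="2025">2025</date>
    <availability>
      <licence target="https://creativecommons.org/licenses/by/4.0/">CC BY 4.0</licence>
    </availability>
  </publicationStmt>
  <sourceDesc>
    <p>Transcribed from MS 1843/17, County Archive.</p>
  </sourceDesc>
</fileDesc>
```

Note that once publicationStmt carries structured children such as publisher and availability, it can no longer also hold the free-text paragraph used in the minimal version; the schema allows one form or the other.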
How do you upgrade a header without breaking anything?
The optional blocks are siblings that follow fileDesc, with revisionDesc last, so you bolt them on without touching your transcription:
```xml
<teiHeader>
  <fileDesc> ... </fileDesc>
  <encodingDesc>
    <projectDesc><p>Diplomatic transcription; original spelling kept.</p></projectDesc>
  </encodingDesc>
  <profileDesc>
    <langUsage><language ident="en">English</language></langUsage>
  </profileDesc>
  <revisionDesc>
    <change when="2025-01-12" who="#er">Added editorial policy.</change>
  </revisionDesc>
</teiHeader>
```

Add encodingDesc when conventions need documenting, profileDesc for languages and participants, and revisionDesc the first time anyone edits the file.
Can you stop a header staying minimal forever?
The risk with "minimal for now" is that it never gets enriched. The only durable fix is to make richness a validation requirement: customise your ODD so that, say, publicationStmt must contain a licence and revisionDesc must exist. Then an under-filled header fails validation in CI, not in a reviewer's inbox six months on. That converts good intentions into an enforced standard.
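One way to express that requirement is with Schematron constraints embedded in the ODD. The fragment below is a sketch under assumptions: the schemaSpec ident, the constraintSpec idents and the message wording are invented for illustration; the scheme value is "schematron" in recent TEI releases (older ones used "isoschematron"); and it assumes your ODD processor binds the tei: prefix and uses the element being specified as the rule context, which is worth checking against your toolchain.

```xml
<schemaSpec xmlns="http://www.tei-c.org/ns/1.0"
            xmlns:sch="http://purl.oclc.org/dsdl/schematron"
            ident="project_header" start="TEI">
  <!-- Modules needed for a basic header-plus-text schema -->
  <moduleRef key="tei"/>
  <moduleRef key="core"/>
  <moduleRef key="header"/>
  <moduleRef key="textstructure"/>

  <!-- Require revisionDesc in every teiHeader -->
  <elementSpec ident="teiHeader" module="header" mode="change">
    <constraintSpec ident="revisionDesc-required" scheme="schematron">
      <constraint>
        <sch:assert test="tei:revisionDesc">
          Every header in this project must contain a revisionDesc.
        </sch:assert>
      </constraint>
    </constraintSpec>
  </elementSpec>

  <!-- Require a licence inside publicationStmt -->
  <elementSpec ident="publicationStmt" module="header" mode="change">
    <constraintSpec ident="licence-required" scheme="schematron">
      <constraint>
        <sch:assert test="tei:availability/tei:licence">
          publicationStmt must state a licence in availability.
        </sch:assert>
      </constraint>
    </constraintSpec>
  </elementSpec>
</schemaSpec>
```

A licence rule like this also forces the structured form of publicationStmt (publisher or another agency first, then availability), so files still using a single free-text paragraph there will fail until they are enriched, which is the point.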
What does this mean for a whole collection?
Decide the policy once, at the collection level, not file by file. A common pattern: encode against a minimal profile during transcription, then run every file through an enrichment pass that fills profileDesc and revisionDesc before deposit. Documenting that two-stage policy in your project handbook keeps the trade-off deliberate rather than accidental.
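The two-stage policy can be sketched as a standalone Schematron file with two phases, so the same rule set serves both transcription and deposit. The phase and pattern ids and the specific assertions are assumptions made for illustration; which checks belong in each stage is a project decision.

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron"
            queryBinding="xslt2" defaultPhase="transcription">
  <sch:ns prefix="tei" uri="http://www.tei-c.org/ns/1.0"/>

  <!-- Stage 1: only the minimal checks run while transcribing -->
  <sch:phase id="transcription">
    <sch:active pattern="minimal"/>
  </sch:phase>

  <!-- Stage 2: deposit adds the enrichment checks on top -->
  <sch:phase id="deposit">
    <sch:active pattern="minimal"/>
    <sch:active pattern="enriched"/>
  </sch:phase>

  <sch:pattern id="minimal">
    <sch:rule context="tei:titleStmt">
      <sch:assert test="normalize-space(tei:title[1]) != ''">
        The title must not be empty.
      </sch:assert>
    </sch:rule>
  </sch:pattern>

  <sch:pattern id="enriched">
    <sch:rule context="tei:teiHeader">
      <sch:assert test="tei:profileDesc/tei:langUsage/tei:language">
        Before deposit, profileDesc must declare at least one language.
      </sch:assert>
      <sch:assert test="tei:revisionDesc/tei:change">
        Before deposit, revisionDesc must record at least one change.
      </sch:assert>
    </sch:rule>
  </sch:pattern>
</sch:schema>
```

Running the default phase during transcription and selecting the deposit phase in the enrichment pass keeps one source of truth for the rules while letting early files stay thin.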
Key Takeaways
- The required minimum is fileDesc with titleStmt, publicationStmt and sourceDesc.
- Minimal headers fit drafts, teaching, single-use jobs and pilots, where the metadata is still unstable.
- Avoid minimal headers for repository deposit, OAI-PMH harvesting and preservation.
- A minimal header is fully valid; richness is an editorial choice, not a validity one.
- Upgrade by appending encodingDesc, profileDesc and revisionDesc; existing markup is not disturbed.
- Enforce required header fields through your ODD so metadata cannot quietly stay thin.
Frequently Asked Questions
What is the minimum required in a TEI header?
TEI requires only fileDesc containing titleStmt (with a title), publicationStmt and sourceDesc. Everything else in teiHeader — encodingDesc, profileDesc, revisionDesc — is optional and added when you need it.
When should I write a minimal TEI header?
Use a minimal header for early drafts, single-use transcriptions, teaching examples and pilots where the metadata is not yet stable. It keeps you encoding instead of fighting metadata you will only revise later.
When is a minimal header a bad idea?
Avoid it for anything you will publish, deposit in a repository or harvest via OAI-PMH. Aggregators, citation tools and preservation systems read header fields, and gaps there become discovery and provenance problems.
Does a minimal header still validate against TEI?
Yes. A header with just titleStmt, publicationStmt and sourceDesc is fully valid against the TEI schema. Validity is about structure; richness is a separate editorial decision.
How do I upgrade a minimal header later?
Add the optional blocks incrementally: encodingDesc for your editorial rules, profileDesc for languages and people, and revisionDesc for the change log. Because they are appended siblings, retrofitting them does not disturb existing markup.
Can I enforce a richer header across a project?
Yes — customise your ODD to make selected header elements mandatory. That turns "we should fill this in" into a validation error, which is the only reliable way to keep a collection's metadata consistent.