When EAD encoding breaks, the cause is almost always one of four things: malformed XML (unescaped characters or unclosed tags), wrong element order inside <did>, broken component nesting, or an encoding/charset mismatch. Validate against the official schema first — never trust a file that merely "opens" — and read the validator's line number and element name, which point you straight at the fault. Fix the structure, not the symptom.
Why does a file that opens fine still fail validation?
Opening without error only proves the XML is well-formed. Validity is stricter: EAD3 specifies which children an element may have and in what order. A common trap is the <did> (Descriptive Identification) block, whose children follow a defined sequence. Put <unittitle> after <unitdate> when the schema wants the reverse and you get a valid-looking but invalid file.
Run the schema check explicitly:
```bash
xmllint --noout --schema ead3.xsd findingaid.xml
# or, for the RelaxNG grammar:
jing ead3.rng findingaid.xml
```
The first reported line and element is your starting point. Fix it and re-run; errors often cascade from one root cause.
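The gap between "it opens" and "it validates" can be illustrated with a small sketch using only Python's standard library. The function name `check_well_formed` and the sample strings are hypothetical; the point is that a successful parse says nothing about the EAD3 schema.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_text: str) -> tuple[bool, str]:
    """Return (ok, message). A successful parse proves well-formedness only."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        line, column = err.position  # line is 1-based, column is 0-based
        return False, f"line {line}, column {column}: {err}"
    return True, "well-formed (not necessarily valid EAD3)"


# Parses cleanly, although nothing here was checked against the schema.
print(check_well_formed("<ead><archdesc><did/></archdesc></ead>"))

# Fails before schema validation could even start: the ampersand is unescaped.
print(check_well_formed("<unittitle>Smith & Sons ledgers</unittitle>"))
```

A check like this is only a first gate; the schema run with xmllint or jing is still what decides validity.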
How do I diagnose the four most common errors?
| Symptom | Likely cause | Fix |
|---|---|---|
| Parser refuses to load file | Unescaped & or stray < in text | Escape as &amp; / &lt;, or use CDATA |
| Well-formed XML, schema fails at <did> | Child elements out of required order | Reorder to the schema sequence |
| Hierarchy displays flat | Broken or mixed <c> nesting | Nest components consistently |
| Accented text shows as Ã© | Charset mismatch | Save UTF-8; set the XML declaration |
Work top-down: well-formedness first, then validity, then display.
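For the first row of the table, escaping can be done programmatically rather than by hand. The following is a minimal sketch using Python's standard `xml.sax.saxutils`; the sample title is invented for illustration.

```python
from xml.sax.saxutils import escape, quoteattr

title = "Smith & Sons: ledgers <incomplete>"

# escape() replaces &, < and > so the text is safe inside an element.
escaped = escape(title)
print(escaped)  # Smith &amp; Sons: ledgers &lt;incomplete&gt;
print(f"<unittitle>{escaped}</unittitle>")

# quoteattr() escapes a value and wraps it in quotes for use as an attribute.
print(f"<unitid label={quoteattr('Box A & B')}>12</unitid>")
```

Text escaped this way should be escaped exactly once; running already-escaped text through `escape` again would turn `&amp;` into `&amp;amp;`.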
What breaks component nesting most often?
EAD3 lets you express hierarchy two ways, and mixing them is the classic failure:
- Unnumbered: generic <c> elements physically nested inside each other.
- Numbered: <c01> through <c12>, where the depth is in the tag name.
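Mixing of the two models, or a skipped numbered level, can be detected mechanically. The sketch below is one possible check, not part of any EAD tooling; `nesting_problems` and `local_name` are hypothetical helper names, and the namespace stripping assumes the file may or may not declare the EAD3 namespace.

```python
import re
import xml.etree.ElementTree as ET

NUMBERED = re.compile(r"^c(0[1-9]|1[0-2])$")  # c01 through c12


def local_name(tag: str) -> str:
    """Drop a '{namespace}' prefix if the file is namespaced."""
    return tag.rsplit("}", 1)[-1]


def nesting_problems(xml_text: str) -> list[str]:
    """List mixed-model and skipped-level problems in a component tree."""
    root = ET.fromstring(xml_text)
    problems: list[str] = []
    models: set[str] = set()

    def walk(element, depth):
        # depth = number of the closest numbered ancestor (0 if none)
        for child in element:
            name = local_name(child.tag)
            if name == "c":
                models.add("unnumbered")
                walk(child, depth)
            elif NUMBERED.match(name):
                models.add("numbered")
                expected = f"c{depth + 1:02d}"
                if name != expected:
                    problems.append(f"<{name}> found where <{expected}> was expected")
                walk(child, int(name[1:]))
            else:
                walk(child, depth)

    walk(root, 0)
    if len(models) == 2:
        problems.append("numbered and unnumbered components are mixed")
    return problems


print(nesting_problems("<dsc><c02><did/></c02></dsc>"))
print(nesting_problems("<dsc><c01><did/><c><did/></c></c01></dsc>"))
```

An empty list means the component tree uses a single model with no skipped levels; it does not replace schema validation.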
Pick one model per finding aid. If you nest a <c02> directly under <dsc> without a <c01>, or drop an unnumbered <c> into a numbered tree, the hierarchy collapses. A minimal correct unnumbered skeleton:
```xml
<dsc>
  <c level="series">
    <did><unitid>1</unitid><unittitle>Minutes</unittitle></did>
    <c level="file">
      <did>
        <unittitle>Board minutes 1923</unittitle>
        <unitdate normal="1923">1923</unitdate>
      </did>
    </c>
  </c>
</dsc>
```
How do I handle dates and special characters cleanly?
Dates and character encoding are two perennial offenders. For dates, always supply a machine-readable normal attribute in ISO 8601 form alongside the human-readable text:
```xml
<unitdate normal="1886/1974">1886-1974</unitdate>
```
Aggregators sort and facet on normal, not on the display string. For characters, save as UTF-8 and declare it:
```xml
<?xml version="1.0" encoding="UTF-8"?>
```
If you see mojibake like Ã© for é, the bytes are UTF-8 but the parser was told otherwise. Fix either the declaration or the file's actual encoding, whichever is wrong, not both blindly.
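The mismatch can be reproduced and diagnosed in a few lines. This is a rough sketch with hypothetical helper names (`declared_encoding`, `decodes_as_utf8`); it assumes the file has no byte-order mark and only looks at the XML declaration, so treat the result as a hint rather than proof.

```python
import re
from typing import Optional

# UTF-8 bytes read as Latin-1 produce the classic two-character mojibake.
misread = "é".encode("utf-8").decode("latin-1")
print(misread)  # Ã©


def declared_encoding(raw: bytes) -> Optional[str]:
    """Return the encoding named in the XML declaration, if there is one."""
    match = re.match(rb'<\?xml[^>]*encoding=["\']([A-Za-z0-9._-]+)["\']', raw)
    return match.group(1).decode("ascii") if match else None


def decodes_as_utf8(raw: bytes) -> bool:
    """True if the bytes form valid UTF-8 (pure ASCII also passes)."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# A file saved as UTF-8 whose declaration claims something else.
raw = '<?xml version="1.0" encoding="ISO-8859-1"?><unittitle>Café</unittitle>'.encode("utf-8")
print(declared_encoding(raw), decodes_as_utf8(raw))
```

When the declaration names one encoding and the bytes decode cleanly as another, the declaration is usually the thing to correct.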
Why does my finding aid look wrong after import to AtoM or ArchivesSpace?
Valid EAD can still import badly because each platform supports a subset of the schema and maps elements its own way. Before bulk-importing:
- Test one representative file end to end.
- Check that <unitid>, <unittitle> and <unitdate> map to the fields you expect.
- Confirm level attribute values match the platform's controlled list (fonds, series, file, item).
- Review nesting depth; some tools cap or reshape very deep trees.
Fixing one file before importing 500 saves a painful cleanup.
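Part of that single-file test can be scripted. The sketch below flags level values outside an assumed controlled list and unitdate elements that lack a normal attribute. `ALLOWED_LEVELS` simply copies the four values named in the checklist and should be replaced with the actual list of the target platform; `preimport_report` is a hypothetical name.

```python
import xml.etree.ElementTree as ET

# Assumption: the platform accepts only these values; adjust for your platform.
ALLOWED_LEVELS = {"fonds", "series", "file", "item"}


def local_name(tag: str) -> str:
    """Drop a '{namespace}' prefix if the file is namespaced."""
    return tag.rsplit("}", 1)[-1]


def preimport_report(xml_text: str) -> dict[str, list[str]]:
    """Collect level values and dates that may import badly."""
    root = ET.fromstring(xml_text)
    report: dict[str, list[str]] = {"unexpected_levels": [], "dates_without_normal": []}
    for element in root.iter():
        level = element.get("level")
        if level is not None and level not in ALLOWED_LEVELS:
            report["unexpected_levels"].append(level)
        if local_name(element.tag) == "unitdate" and not element.get("normal"):
            report["dates_without_normal"].append((element.text or "").strip())
    return report


sample = (
    '<dsc><c level="subseries"><did>'
    "<unitdate>1923</unitdate>"
    '<unitdate normal="1924">1924</unitdate>'
    "</did></c></dsc>"
)
print(preimport_report(sample))
```

A value such as subseries may be perfectly valid EAD3 and still be reported here, which is the point: the check is about what the platform accepts, not what the schema allows.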
Key Takeaways
- Always validate against the EAD3 schema; "it opens" only proves well-formedness.
- Most schema failures are element-order problems inside <did>; read the validator's line and element.
- Choose one component model (numbered or unnumbered) and never mix them.
- Save as UTF-8 and declare it; escape the five reserved characters or use CDATA.
- Add ISO normal attributes to dates so aggregators sort and facet correctly.
- Test a single file through your target platform before any bulk import.
Frequently Asked Questions
Why does my EAD file fail schema validation but look fine?
Most often the element order is wrong. EAD3 enforces sequence within elements (e.g. <did> children must follow a defined order), so a misplaced but well-formed element still fails. Validate against the RNG or XSD to see the exact position.
What is the difference between well-formed and valid EAD?
Well-formed means the XML syntax is correct (closed tags, one root, escaped entities). Valid means it also conforms to the EAD schema's rules about which elements may appear where. You need both.
Should I use EAD 2002 or EAD3?
Use EAD3 for new work; it has better support for dates, relations and standardised attributes. Stay on EAD 2002 only if a required aggregator or tool cannot yet ingest EAD3.
How do I fix encoding errors with special characters?
Ensure the file is saved as UTF-8 and that the XML declaration says so. Escape the five reserved characters (&, <, >, ', ") or wrap problematic content in a CDATA section.
Why are my component levels nesting incorrectly?
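In EAD3 the hierarchy comes from physically nesting components: either generic <c> elements, or numbered <c01>-<c12> elements whose tag names record the depth. The level attribute only labels the archival level; it does not create nesting. Mixing the numbered and unnumbered models, or breaking the nesting, causes the hierarchy to flatten.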
Can I validate EAD from the command line?
Yes. Use xmllint --noout --schema ead3.xsd file.xml for XSD, or jing ead3.rng file.xml for the RelaxNG grammar. Both report the line and element where validation fails.