TEI & XML Encoding

For scholarly editions, choose TEI in almost every case. TEI P5 was designed to model what editions contain: variant readings, manuscript description, named entities, uncertain dates, and editorial responsibility. DocBook has dedicated elements for almost none of these. DocBook is an excellent vocabulary for technical documentation and books, but using it for a source-critical edition means inventing or omitting exactly the structures that make an edition scholarly. Pick DocBook only when the deliverable is essentially a manual or expository book with no critical layer.

What is each format actually for?

DocBook, originating in the early 1990s, is a vocabulary for technical and reference documentation: software manuals, books, articles. Its strengths are a deep structural hierarchy and a famously complete publishing toolchain. TEI is a vocabulary for representing texts as objects of study in the humanities. The difference is purpose: DocBook describes how a document is organised; TEI describes what a text is and means, including its physical source and its textual history.

Which one models scholarly features?

This is the decisive axis. The table shows where the gap lies.

Feature an edition needs | TEI | DocBook
Variant readings / apparatus | app, lem, rdg | none
Manuscript description | msDesc, physDesc | none
Named entities (person, place) | persName, placeName | none
Datable / uncertain dates | date with @cert, @notBefore | none
Corrections, deletions, additions | del, add, subst | none
Editorial responsibility | respStmt, @resp | partial (info)
Structural sections, lists, tables | yes | yes

DocBook matches TEI on ordinary structure but has no equivalent for the source-critical rows of the table. You cannot encode what the schema does not define.
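To make the TEI column concrete, here is a minimal sketch of what an encoded apparatus entry and an uncertain date give you once they are in the markup. The fragment is hypothetical: the witness sigla (#A, #B, #C), the readings, and the date are invented for illustration, and the helper names `collate` and `date_ranges` are not part of any TEI tool. It uses only Python's standard `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
NS = {"tei": TEI_NS}

# Hypothetical fragment: sigla, readings, and date are invented for illustration.
SAMPLE = """
<p xmlns="http://www.tei-c.org/ns/1.0">In the
  <app>
    <lem wit="#A">beginning</lem>
    <rdg wit="#B">begynnyng</rdg>
    <rdg wit="#C">bigynnyng</rdg>
  </app>
  of the year
  <date notBefore="1380" notAfter="1400" cert="medium">about 1390</date>.
</p>
""".strip()


def collate(xml_text):
    """Return one dict per <app>: the lemma, its witness, and the variant readings."""
    root = ET.fromstring(xml_text)
    entries = []
    for app in root.iter(f"{{{TEI_NS}}}app"):
        lem = app.find("tei:lem", NS)
        entries.append({
            "lemma": lem.text if lem is not None else None,
            "witness": lem.get("wit") if lem is not None else None,
            # Map each witness siglum to the reading it attests.
            "readings": {rdg.get("wit"): rdg.text for rdg in app.findall("tei:rdg", NS)},
        })
    return entries


def date_ranges(xml_text):
    """Return (notBefore, notAfter, cert) for every <date> in the fragment."""
    root = ET.fromstring(xml_text)
    return [(d.get("notBefore"), d.get("notAfter"), d.get("cert"))
            for d in root.iter(f"{{{TEI_NS}}}date")]


print(collate(SAMPLE))
print(date_ranges(SAMPLE))
```

Because the variants and the date range are explicit markup, a few lines of code can query them. In DocBook the same information would have to live in prose or in ad hoc conventions that no schema checks.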

When is DocBook genuinely the better pick?

Be honest about the deliverable. DocBook wins when:

  • The output is a technical manual, API reference, or software book.
  • You want polished PDF fast with minimal custom code.
  • There is no manuscript, no variants, and no editorial apparatus.
  • Your team already knows DocBook and the corpus is born-digital prose.

In those cases TEI would be over-engineering, and DocBook's mature stylesheets save real time.

How do the publishing toolchains compare?

DocBook ships with the DocBook XSL stylesheets: run them through a processor and you get HTML or, via XSL-FO and a FO engine, a respectable PDF with little configuration. TEI's output is more bespoke — you typically write or adapt XSLT (or use the TEI Stylesheets / CETEIcean for the web). So DocBook is more turnkey, TEI more flexible. For an edition you almost always want the flexibility, because the rendering of an apparatus or a choice element is editorial, not generic.

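```bash
# DocBook to PDF: largely off-the-shelf.
# fo/docbook.xsl is the FO stylesheet from the DocBook XSL distribution;
# adjust the path to wherever the stylesheets are installed.
xsltproc -o edition.fo fo/docbook.xsl edition.xml
fop -fo edition.fo -pdf edition.pdf

# TEI to HTML: usually your own stylesheet.
# "saxon" stands for however Saxon is launched on your system.
saxon -s:edition.xml -xsl:tei-to-html.xsl -o:edition.html
```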
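Why rendering a choice element is an editorial decision can be shown in a few lines. The sketch below is an illustration only, not how the TEI Stylesheets or CETEIcean work: the sample sentence and the `text_of` helper are hypothetical, and a real edition would normally do this in XSLT. It flattens a paragraph to plain text and lets the caller decide whether the original spelling (sic) or the editor's correction (corr) is shown.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Hypothetical sample sentence, invented for illustration.
SAMPLE = (
    '<p xmlns="http://www.tei-c.org/ns/1.0">He '
    "<choice><sic>recieved</sic><corr>received</corr></choice>"
    " the letter.</p>"
)


def text_of(elem, prefer):
    """Flatten elem to text, keeping only the preferred child of each <choice>.

    prefer is the local name of the alternative to show: "sic" or "corr".
    """
    if elem.tag == f"{{{TEI_NS}}}choice":
        picked = elem.find(f"{{{TEI_NS}}}{prefer}")
        return text_of(picked, prefer) if picked is not None else ""
    parts = [elem.text or ""]
    for child in elem:
        parts.append(text_of(child, prefer))
        parts.append(child.tail or "")  # text following the child belongs to the parent
    return "".join(parts)


root = ET.fromstring(SAMPLE)
print(text_of(root, "sic"))   # diplomatic view: the source's spelling
print(text_of(root, "corr"))  # reading view: the editor's correction
```

The same encoded text yields two different outputs, and which one a reader sees by default is a policy the edition has to state. A generic stylesheet cannot make that choice for you.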

What if I started in the wrong format?

If you began in DocBook and now need scholarly features, structural conversion to TEI is doable with XSLT — sections, paragraphs, lists, and tables map fairly cleanly. But conversion cannot recover apparatus or manuscript description you never encoded, because DocBook gave you nowhere to put it. The cost of starting wrong is not the conversion script; it is the scholarship you failed to capture along the way. Choose deliberately at the outset.
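The structural half of such a migration might look like the sketch below. It is a minimal illustration under stated assumptions: the input is DocBook 5 (namespaced), the mapping table covers only a handful of elements chosen for the example, and the `convert` helper and the sample document are hypothetical. A production conversion would more likely be written in XSLT and would need a much larger mapping.

```python
import xml.etree.ElementTree as ET

DOCBOOK_NS = "http://docbook.org/ns/docbook"
TEI_NS = "http://www.tei-c.org/ns/1.0"

# Illustrative subset of a structural mapping (DocBook local name -> TEI local name).
TAG_MAP = {
    "section": "div",
    "title": "head",
    "para": "p",
    "itemizedlist": "list",
    "listitem": "item",
    "emphasis": "hi",
}

# Hypothetical DocBook input, invented for illustration.
DOCBOOK_SAMPLE = """<section xmlns="http://docbook.org/ns/docbook">
  <title>Preface</title>
  <para>This letter survives in <emphasis>two</emphasis> copies.</para>
  <itemizedlist><listitem><para>Copy A</para></listitem></itemizedlist>
</section>"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.split("}", 1)[-1]


def convert(node):
    """Recursively rebuild a DocBook element as its TEI structural counterpart."""
    name = local_name(node.tag)
    if name not in TAG_MAP:
        raise ValueError(f"no structural mapping for DocBook <{name}>")
    out = ET.Element(f"{{{TEI_NS}}}{TAG_MAP[name]}")
    out.text, out.tail = node.text, node.tail
    for child in node:
        out.append(convert(child))
    return out


tei = convert(ET.fromstring(DOCBOOK_SAMPLE))
print([local_name(e.tag) for e in tei.iter()])
```

Note what the output contains: divisions, headings, paragraphs, and lists, and nothing else. The prose says the letter survives in two copies, but no app, rdg, or msDesc can appear in the result, because the source never recorded which copy reads what.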

A best-practice decision checklist

  1. Do the sources have variants, witnesses, or an apparatus? If yes, TEI.
  2. Do you need to describe physical manuscripts (msDesc)? If yes, TEI.
  3. Will you mark up people, places, and uncertain dates for analysis? If yes, TEI.
  4. Is the deliverable essentially a manual or expository book? If yes, DocBook is viable.
  5. Do you need fast, standard PDF with no critical layer? DocBook's toolchain helps.
  6. Document the decision and rationale in your project's encoding guidelines either way.
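The checklist above can be sketched as a small function, which is also a convenient way to record the rationale that item 6 asks for. The `Project` fields and the `recommend` helper are hypothetical names introduced for this illustration. The fallback to TEI when no question is answered with yes is an assumption based on the article's opening advice for scholarly editions.

```python
from dataclasses import dataclass


@dataclass
class Project:
    """Answers to the checklist questions; field names are illustrative."""
    has_variants_or_witnesses: bool = False
    describes_manuscripts: bool = False
    marks_entities_or_uncertain_dates: bool = False
    is_manual_or_expository_book: bool = False


def recommend(project):
    """Return (format, reasons) following the checklist order."""
    reasons = []
    if project.has_variants_or_witnesses:
        reasons.append("sources have variants, witnesses, or an apparatus")
    if project.describes_manuscripts:
        reasons.append("physical manuscripts must be described")
    if project.marks_entities_or_uncertain_dates:
        reasons.append("people, places, or uncertain dates are marked up for analysis")
    if reasons:
        return "TEI", reasons
    if project.is_manual_or_expository_book:
        return "DocBook", ["expository deliverable with no critical layer"]
    # Assumption: with no signal either way, follow the default for scholarly editions.
    return "TEI", ["default for scholarly editions"]


print(recommend(Project(has_variants_or_witnesses=True, is_manual_or_expository_book=True)))
print(recommend(Project(is_manual_or_expository_book=True)))
```

Any critical or descriptive feature outranks the expository question, which mirrors the order of the checklist: DocBook is only considered once the first three answers are all no.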

Key Takeaways

  • TEI is purpose-built for editions; DocBook is purpose-built for documentation.
  • Only TEI has elements for apparatus, msDesc, named entities, and uncertain dates.
  • Choose DocBook solely when the deliverable is a manual or book with no critical layer.
  • DocBook's publishing chain is more turnkey; TEI's is more flexible and editorially controllable.
  • Converting DocBook to TEI recovers structure but not scholarship you never encoded.
  • The deciding question is whether your sources have a critical or descriptive dimension.
  • Record your format choice and rationale in the project encoding guidelines.

Frequently Asked Questions

Is TEI or DocBook better for a scholarly edition?

TEI, in almost every case. It models the things editions need — variant readings, manuscript description, named entities, uncertainty — that DocBook simply does not have elements for. DocBook is a technical-documentation vocabulary.

When would DocBook actually be the right choice?

When your real product is a manual, software documentation, or a structured book with no source-critical layer, and you want DocBook's mature toolchain for PDF and HTML output via the DocBook XSL stylesheets.

Can I convert DocBook to TEI later if I start wrong?

Partly. Structural elements (sections, paragraphs, lists, tables) map reasonably with XSLT, but any scholarly apparatus you needed will simply be missing, because you never had elements to capture it. Migrating loses nothing only if there was nothing TEI-specific to begin with.

Does DocBook have anything like the TEI header?

DocBook has info blocks for titles, authors, and publication data, plus revhistory for logging revisions, but nothing equivalent to sourceDesc or msDesc for describing manuscript sources and editorial responsibility at scholarly depth.

Which has better publishing tooling?

DocBook has the more turnkey publishing chain (DocBook XSL plus a FO processor gives polished PDF quickly). TEI's tooling is research-grade and more flexible but usually needs custom XSLT for output.

What is the single deciding question?

Ask whether your sources have a critical or descriptive dimension — variants, provenance, uncertainty, manuscript features. If yes, choose TEI. If the document is purely expository, DocBook is viable.