TEI & XML Encoding

To transform TEI to HTML with XSLT, write a stylesheet whose templates match TEI elements and emit HTML, bind the tei: prefix to the TEI namespace, and run it through an XSLT processor such as Saxon. A div becomes a section, a head becomes an h2, an italic hi becomes an em. Run saxon -s:edition.xml -xsl:tei2html.xsl -o:edition.html and you have a web page. The official TEI Stylesheets do this generically; write your own when you need control.

The two things that trip everyone up are the namespace and the teiHeader. Get those right and the rest is straightforward template-matching.

What do I need to set up?

Three pieces: your TEI file, a stylesheet, and a processor.

Tool                | XSLT version | Notes
Saxon (HE is free)  | 2.0 / 3.0    | Recommended; grouping, regex, functions
xsltproc (libxslt)  | 1.0 only     | Lightweight, fine for simple transforms
TEI Stylesheets     | 2.0          | Ready-made teitohtml command
CETEIcean           | n/a (JS)     | Renders TEI in the browser, no XSLT

For anything beyond a trivial page, install Saxon HE and use XSLT 2.0 or 3.0.
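
If you are unsure which XSLT version an installed processor implements, a small probe stylesheet can report it. This is a minimal sketch using the standard system-property() function; the file name probe.xsl is hypothetical, and it can be run against any well-formed XML file, including edition.xml. It declares version 2.0 but uses only 1.0 constructs, so a 1.0 processor should handle it in forwards-compatible mode, possibly with a warning.

```xml
<!-- probe.xsl (hypothetical name): prints the processor vendor and
     the XSLT version it implements. -->
<xsl:stylesheet version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <xsl:template match="/">
    <xsl:value-of select="system-property('xsl:vendor')"/>
    <xsl:text> implements XSLT </xsl:text>
    <xsl:value-of select="system-property('xsl:version')"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

The reported version describes the processor, not the stylesheet: xsltproc reports 1.0 here even though the stylesheet declares 2.0.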

Why is the namespace the first thing to get right?

TEI elements live in the namespace http://www.tei-c.org/ns/1.0. If your stylesheet matches <xsl:template match="p"> without a prefix, it matches nothing in a TEI document: your templates never fire, the output comes out empty or as a tag-free run of text, and you waste an hour. Bind a prefix:

xml
<xsl:stylesheet version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:tei="http://www.tei-c.org/ns/1.0"
  exclude-result-prefixes="tei">

Now match="tei:p" works. This single mistake accounts for most "my XSLT does nothing" reports.
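
XSLT 2.0 and later offer an alternative to prefixing every step: the xpath-default-namespace attribute. The sketch below assumes a 2.0+ processor such as Saxon; it will not work in xsltproc, and the single template is illustrative only.

```xml
<xsl:stylesheet version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xpath-default-namespace="http://www.tei-c.org/ns/1.0">

  <!-- Unprefixed element names in match patterns and XPath expressions
       now resolve to the TEI namespace, so "p" here means a TEI p. -->
  <xsl:template match="p">
    <!-- The literal <p> is output markup and is not affected by
         xpath-default-namespace; it stays in no namespace. -->
    <p><xsl:apply-templates/></p>
  </xsl:template>

</xsl:stylesheet>
```

Either style works. Mixing the two in one stylesheet tends to cause confusion, so the examples on this page keep the explicit tei: prefix.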

How do I write the core templates?

Start with a root template that builds the HTML shell, then match the elements you care about. Suppress the header explicitly so its metadata does not leak into the page:

xml
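<xsl:template match="/">
  <html><head><title>
    <!-- first title of the file's own titleStmt, not a nested one -->
    <xsl:value-of select="/tei:TEI/tei:teiHeader/tei:fileDesc/tei:titleStmt/tei:title[1]"/>
  </title></head>
  <!-- top-level text only, so a nested tei:text is not processed twice -->
  <body><xsl:apply-templates select="/tei:TEI/tei:text"/></body></html>
</xsl:template>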

<xsl:template match="tei:teiHeader"/>            <!-- swallow metadata -->

<xsl:template match="tei:div">
  <section><xsl:apply-templates/></section>
</xsl:template>

<xsl:template match="tei:head">
  <h2><xsl:apply-templates/></h2>
</xsl:template>

<xsl:template match="tei:p">
  <p><xsl:apply-templates/></p>
</xsl:template>

<xsl:template match="tei:hi[@rend='italic']">
  <em><xsl:apply-templates/></em>
</xsl:template>

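The empty tei:teiHeader template is a safeguard. The root template only descends into tei:text, so the header is never visited, but as soon as any template applies templates to the whole document (for example a bare <xsl:apply-templates/> from tei:TEI), XSLT's built-in rules copy the header's text into your page unless this template swallows it.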
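
The tei:head template emits h2 at every depth. Where divs nest, a variant can derive the heading level from the number of enclosing tei:div elements. This is a sketch assuming XSLT 2.0 (it uses min()) and assuming that a head in a top-level div should map to h2; adjust the offset to suit your page structure.

```xml
<!-- Assumes the tei: prefix is bound on xsl:stylesheet as shown earlier.
     More specific than match="tei:head", so it takes precedence for
     heads that are direct children of a div. -->
<xsl:template match="tei:div/tei:head">
  <!-- 1 for a head in a top-level div, 2 inside a nested div, and so on -->
  <xsl:variable name="depth" select="count(ancestor::tei:div)"/>
  <!-- h2 at depth 1, capped at h6 because HTML has no h7 -->
  <xsl:element name="h{min(($depth + 1, 6))}">
    <xsl:apply-templates/>
  </xsl:element>
</xsl:template>
```

Heads elsewhere (in a list, figure, or table) do not match this pattern and still fall through to the plain tei:head template.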

How do I handle names, places, and page breaks?

Carry your encoding through to useful HTML. Turn persName/@ref into a data attribute or link, and render pb as a visible marker:

xml
<xsl:template match="tei:persName">
  <span class="person" data-ref="{@ref}">
    <xsl:apply-templates/></span>
</xsl:template>

<xsl:template match="tei:pb">
  <span class="pagebreak" id="p{@n}">[p. <xsl:value-of select="@n"/>]</span>
</xsl:template>

Note the attribute value template {@ref} inside the HTML attribute — that is how you inject a TEI attribute value into output. With class="person" you can then style or script entities in the browser.
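
The same pattern extends to places, and to real links when @ref is present. The following is a sketch only: it assumes @ref holds one or more whitespace-separated URIs or local pointers and links to the first, which may not match your project's conventions. The class names are arbitrary, and tokenize() and the select attribute on xsl:attribute require XSLT 2.0.

```xml
<!-- More specific than match="tei:persName", so it wins when @ref exists. -->
<xsl:template match="tei:persName[@ref]">
  <a class="person" href="{tokenize(normalize-space(@ref), ' ')[1]}">
    <xsl:apply-templates/>
  </a>
</xsl:template>

<xsl:template match="tei:placeName">
  <span class="place">
    <!-- Emit data-ref only when the source actually carries @ref,
         avoiding an empty attribute in the output. -->
    <xsl:if test="@ref">
      <xsl:attribute name="data-ref" select="@ref"/>
    </xsl:if>
    <xsl:apply-templates/>
  </span>
</xsl:template>
```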

How do I run the transformation?

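From the command line with Saxon. The saxon command assumes a launcher script on your PATH, as some package managers install; with only the jar file, run it with java -jar and the same options: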

bash
saxon -s:edition.xml -xsl:tei2html.xsl -o:edition.html
# or with the official TEI Stylesheets:
teitohtml edition.xml edition.html

For a corpus, loop over files. If you want client-side rendering with no build step at all, CETEIcean loads the raw TEI in the browser and maps elements to custom HTML elements you then style with CSS — a good fit for lightweight publishing.
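
One way to loop over a corpus without leaving XSLT is a small driver stylesheet. This is a sketch under several assumptions: the processor is Saxon (the ?select=*.xml query on the collection URI is Saxon-specific), the TEI files sit in a directory named corpus/ next to the stylesheet, and the driver file and parameter names are hypothetical. Where the output files land depends on the base output URI, which in turn depends on how the processor is invoked.

```xml
<xsl:stylesheet version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Reuse the per-document templates from the main stylesheet -->
  <xsl:import href="tei2html.xsl"/>

  <!-- Hypothetical parameter: directory URI holding the TEI files,
       resolved relative to this stylesheet -->
  <xsl:param name="corpus" select="'corpus/'"/>

  <!-- Entry point; with Saxon, start here using the -it:main option -->
  <xsl:template name="main">
    <xsl:for-each select="collection(concat($corpus, '?select=*.xml'))">
      <!-- File name without directory or .xml extension -->
      <xsl:variable name="stem"
        select="replace(tokenize(base-uri(.), '/')[last()], '\.xml$', '')"/>
      <!-- One HTML file per source document -->
      <xsl:result-document href="{$stem}.html" method="html">
        <!-- Hands the document node to the imported match="/" template -->
        <xsl:apply-templates select="."/>
      </xsl:result-document>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

A shell loop that calls the processor once per file achieves the same result; the driver mainly saves repeated processor start-up on large collections.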

What are the common gotchas?

  • Empty output → almost always the missing tei: namespace prefix.
  • Metadata in the page → unhandled teiHeader; add the empty template.
  • Whitespace mangling → control it with xsl:strip-space and xsl:text.
  • Apparatus dumped inline → write a deliberate template for tei:app (footnote or popover), or it renders as a run-on.
  • 1.0 vs 2.0 confusion → xsltproc will reject 2.0 syntax; match your processor to your stylesheet version.
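
Two of these gotchas, whitespace and apparatus, can be sketched in a few lines. The element list for xsl:strip-space and the class names are illustrative assumptions, not a recommended set; the tei:app template keeps the lemma inline and wraps each reading so CSS or a script can present it as a footnote or popover.

```xml
<!-- Top-level declaration: drop whitespace-only text nodes inside
     purely structural elements (list is illustrative, not exhaustive). -->
<xsl:strip-space elements="tei:TEI tei:text tei:body tei:div"/>

<!-- Lemma first, then one span per reading carrying its witness sigla. -->
<xsl:template match="tei:app">
  <span class="app">
    <span class="lem"><xsl:apply-templates select="tei:lem"/></span>
    <xsl:for-each select="tei:rdg">
      <span class="rdg" data-wit="{@wit}"><xsl:apply-templates/></span>
    </xsl:for-each>
  </span>
</xsl:template>
```

Avoid listing mixed-content elements such as tei:p in xsl:strip-space, since a whitespace-only text node between two inline elements there is usually a meaningful word space.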

Key Takeaways

  • Bind the tei: prefix to http://www.tei-c.org/ns/1.0 first — unprefixed matches fail silently.
  • Suppress teiHeader with an empty template so metadata stays out of the page.
  • Match elements to semantic HTML: div → section, head → h2, hi → em.
  • Carry @ref into data- attributes and classes so entities stay actionable in HTML.
  • Use Saxon for XSLT 2.0/3.0; xsltproc only handles 1.0.
  • Use the TEI Stylesheets or CETEIcean for a fast result; hand-write XSLT for full control.

Frequently Asked Questions

What do I need to transform TEI to HTML?

An XSLT stylesheet, an XSLT processor (Saxon for XSLT 2.0/3.0, or xsltproc for 1.0), and your TEI file. You write templates that match TEI elements and output HTML, then run the processor to produce the page.

Should I write my own XSLT or use the TEI Stylesheets?

For a quick result, use the official TEI Stylesheets or a framework like CETEIcean. Write your own XSLT when you need full control over the layout, apparatus display, or custom interactivity that the generic stylesheets do not provide.

Why must I declare the TEI namespace in my XSLT?

Because TEI elements live in the http://www.tei-c.org/ns/1.0 namespace. If your match patterns omit the prefix, they silently match nothing and you get empty output. Bind a prefix like tei: to that namespace at the top of the stylesheet.

What XSLT version should I use?

XSLT 2.0 or 3.0 with Saxon is recommended for serious work: it has grouping, regular expressions, and functions that make TEI processing far easier than 1.0. xsltproc only supports 1.0 and is fine for simple transforms.

How do I handle the teiHeader so it does not appear in the output?

Add an empty template matching tei:teiHeader, or only apply templates to tei:text. By default XSLT's built-in rules copy text content, so an unhandled header dumps metadata into your page; suppress it explicitly.