TEI & XML Encoding

To encode verse and drama in TEI, wrap each metrical line in l, group lines in lg, and model dialogue with sp (speech) containing a speaker label and the spoken l lines. Stage directions go in stage, and a castList in the front matter defines the characters that each sp points to through who. Keeping these building blocks consistent across a collection is what separates a quick transcription from an edition you can query, render and trust.

What are the core elements for verse and drama?

The TEI verse and drama modules give you a small, stable vocabulary. Learn these and you can encode most plays and poems:

| Element | Purpose | Common attributes |
| --- | --- | --- |
| l | one metrical line | n, met, real, rhyme |
| lg | line group (stanza, scene-speech) | type, n |
| sp | a speech | who |
| speaker | the displayed speaker label | |
| stage | stage direction | type |
| castList / castItem | the dramatis personae | xml:id |

A short worked example shows how they fit together:

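```xml
<sp who="#hamlet">
  <speaker>Hamlet</speaker>
  <!-- met is the abstract pattern; real is one possible scansion of the line as written -->
  <l n="56" met="-+-+-+-+-+" real="-+-+-+-+-+-">To be, or not to be, that is the question:</l>
  <l met="-+-+-+-+-+" real="+--+-+-+-+-">Whether 'tis nobler in the mind to suffer</l>
</sp>
<stage type="exit">Exit Ophelia.</stage>
```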

How do you decide between l and p?

Use l only for verse, that is, text set as metrical lines. Prose passages, even prose spoken by a character, belong in p. Mixed plays are common: Shakespeare's clowns typically speak prose while nobles mostly speak verse. The defensible rule is to follow the source's lineation, not your sense of rhythm. If the printed text breaks a line at a fixed point, that is an l; if it wraps freely to fill the measure, it is p.
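
A minimal sketch of that rule, assuming the same character speaks prose in one scene and verse in another. The who value and the choice of passages are illustrative, and the prose speech is cut to its opening sentence:

```xml
<!-- Prose: the source wraps the text freely, so it goes in p. -->
<sp who="#hamlet">
  <speaker>Hamlet</speaker>
  <p>What a piece of work is a man!</p>
</sp>

<!-- Verse: the source breaks the line at a fixed point, so it is an l. -->
<sp who="#hamlet">
  <speaker>Hamlet</speaker>
  <l>To be, or not to be, that is the question:</l>
</sp>
```

The element choice depends on how the source sets the text, not on who is speaking.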

How should you structure scenes and acts?

Model the hierarchy with nested div elements carrying type="act" and type="scene". Keep sp elements as children of the scene div, never floating at the act level. A clean skeleton:

```xml
<div type="act" n="1">
  <div type="scene" n="1">
    <stage type="setting">A platform before the castle.</stage>
    <sp who="#bernardo"><speaker>Bernardo</speaker>
      <l>Who's there?</l></sp>
  </div>
</div>
```

This structure lets a stylesheet generate a navigable table of contents and lets you count speeches per character per scene with a single XPath.
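
One way to get that count is a short XSLT 1.0 stylesheet built around a single count() expression. This is a sketch, not a finished report: it assumes the document is in the TEI namespace (bound here to the tei prefix), that scenes sit directly inside acts, and that each who holds exactly one pointer.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:tei="http://www.tei-c.org/ns/1.0">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <xsl:for-each select="//tei:div[@type='scene']">
      <xsl:variable name="scene" select="."/>
      <!-- Visit the first speech of each speaker in this scene. -->
      <xsl:for-each select="tei:sp[@who][not(@who = preceding-sibling::tei:sp/@who)]">
        <!-- One output line: act.scene, speaker pointer, number of speeches. -->
        <xsl:value-of select="concat(../../@n, '.', ../@n, ' ', @who, ' ',
            count($scene/tei:sp[@who = current()/@who]), '&#10;')"/>
      </xsl:for-each>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

The count only works this simply because every sp is a direct child of its scene div; speeches nested at other levels would need a different path.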

Recording metre and rhyme

If your project analyses prosody, declare the notation once in the header so the symbols are documented:

```xml
<metDecl xml:id="ms" type="met">
  <metSym value="+">stressed syllable</metSym>
  <metSym value="-">unstressed syllable</metSym>
</metDecl>
```

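Then met on each l (or once on the enclosing lg) records the abstract pattern, and real records what actually happens when a line departs from it. The rhyme attribute records a rhyme scheme, so it usually sits on the lg, for example rhyme="abab"; the rhyme element with a label marks the individual rhyming words. Together these let you query rhyme schemes across a corpus, which is invaluable for cultural-analytics work on poetic form.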
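
As an illustration, a couplet might carry its pattern and rhyme scheme on the lg, with the TEI rhyme element marking the rhyming words themselves. The sketch assumes the "+" and "-" notation declared above; real is left off here and would be added to any l whose scansion your project judges to depart from the pattern.

```xml
<lg type="couplet" met="-+-+-+-+-+" rhyme="aa">
  <!-- met and rhyme on lg apply to both lines; label ties the rhyme words together -->
  <l n="13">So long as men can breathe or eyes can <rhyme label="a">see</rhyme>,</l>
  <l n="14">So long lives this, and this gives life to <rhyme label="a">thee</rhyme>.</l>
</lg>
```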

A working checklist before you call it done

Run every text through this list so the whole collection stays consistent:

  • Every metrical line is an l; prose is p; nothing is left as bare text in sp.
  • Each sp has a who that resolves to a castItem xml:id.
  • speaker labels are normalised (one canonical spelling per character).
  • Stage directions use a controlled type vocabulary, documented in your ODD.
  • Page and line breaks are encoded as milestones (pb, lb); they never split an l element.
  • The document validates against your project schema, not just tei_all.
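
Some of these checks can be automated. The Schematron sketch below covers two of them, under stated assumptions: who values are "#"-prefixed pointers to castItem xml:id values in the same file, and the list of stage types is a hypothetical project vocabulary that you would replace with your own.

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2">
  <sch:ns prefix="tei" uri="http://www.tei-c.org/ns/1.0"/>

  <sch:pattern>
    <!-- Every speech must point at a character defined in the cast list. -->
    <sch:rule context="tei:sp">
      <sch:assert test="@who">sp has no who attribute.</sch:assert>
      <sch:assert test="every $ref in tokenize(normalize-space(@who), ' ')
                        satisfies substring-after($ref, '#') = //tei:castItem/@xml:id">
        who does not resolve to a castItem xml:id.
      </sch:assert>
    </sch:rule>

    <!-- Stage directions must use the project's vocabulary (hypothetical list). -->
    <sch:rule context="tei:stage">
      <sch:assert test="@type = ('setting', 'entrance', 'exit', 'delivery')">
        stage type is missing or not in the project vocabulary.
      </sch:assert>
    </sch:rule>
  </sch:pattern>
</sch:schema>
```

Checks that depend on editorial judgement, such as whether a passage is verse or prose, still need a human reader.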

Why document your decisions in the header?

Two encoders will disagree about borderline cases — a half-line shared by two speakers, a song embedded in dialogue. Record your rules in encodingDesc so the next person follows the same conventions. A collection encoded to a written guideline is defensible in peer review; one encoded by instinct is not.
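
A sketch of what such a record might look like, using editorialDecl inside encodingDesc. The three rules are hypothetical examples of project conventions, not recommendations; part is the TEI attribute for marking an l as the initial or final fragment of a divided line.

```xml
<encodingDesc>
  <editorialDecl>
    <p>Verse and prose follow the lineation of the copy-text: text set in
       fixed lines is encoded as l, text that fills the measure as p.</p>
    <p>A verse line shared by two speakers is encoded as one l per speaker,
       marked part="I" and part="F".</p>
    <p>Songs embedded in dialogue are encoded as lg type="song" inside the sp.</p>
  </editorialDecl>
</encodingDesc>
```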

Key Takeaways

  • l for verse lines, p for prose — follow the source's lineation, not your ear.
  • Group lines in lg with a type; group speeches and acts with nested div.
  • Every sp needs a who pointing to a castItem; keep speaker labels normalised.
  • Use a controlled type vocabulary on stage and document it in your ODD.
  • Declare metre notation in metDecl before using met, real and rhyme.
  • Keep a page-spanning line in one l and mark breaks with lb/pb; write down your rule for a line shared between speakers.
  • Validate against a project schema and write your conventions into encodingDesc.

Frequently Asked Questions

Should I use lg or just l for verse in TEI?

Use l for every metrical line and group related lines in lg (line group), with a type such as stanza, couplet or speech. A bare run of l elements without lg is valid but loses the structure that downstream tools and stylesheets rely on.
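
For instance, a song inside a speech can be grouped so that a stylesheet can treat it as a unit. The type value "song" and the who pointer are illustrative choices:

```xml
<sp who="#ophelia">
  <speaker>Ophelia</speaker>
  <!-- The lg keeps the four sung lines together as one unit inside the speech. -->
  <lg type="song">
    <l>How should I your true love know</l>
    <l>From another one?</l>
    <l>By his cockle hat and staff,</l>
    <l>And his sandal shoon.</l>
  </lg>
</sp>
```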

How do I number verse lines without typing every number by hand?

Add n only where the source prints a line number, and let a processor count the rest. A common approach is to record the printed numbering in n and derive a continuous count in XSLT, with xsl:number or the XPath count(preceding::l) + 1, rather than hand-numbering every line. The + 1 matters: count(preceding::l) alone numbers the first line 0.
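
A minimal XSLT 1.0 sketch of a derived count, assuming the TEI namespace and scenes marked as div type="scene". It prints each line with a number that restarts at every scene and leaves the printed n values untouched. It counts every l, so a project that splits shared lines into several l elements would need to adjust the count pattern.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:tei="http://www.tei-c.org/ns/1.0">
  <xsl:output method="text"/>

  <!-- Visit only verse lines; everything else is ignored in this sketch. -->
  <xsl:template match="/">
    <xsl:apply-templates select="//tei:l"/>
  </xsl:template>

  <xsl:template match="tei:l">
    <!-- Derived number: counts l elements in document order, restarting at each scene. -->
    <xsl:number level="any" count="tei:l" from="tei:div[@type='scene']"/>
    <xsl:text>&#9;</xsl:text>
    <xsl:value-of select="normalize-space(.)"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```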

What element holds a speaker's name in a play?

Wrap the spoken text in sp (speech), put the speaker label in speaker, and link the speech to the cast list with who on sp. The speaker element is display text; who carries the machine-readable pointer to a castItem.
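
A small sketch of the two ends of that link. The xml:id values are assumptions chosen for this example; what matters is that who repeats them with a leading "#".

```xml
<!-- In the front matter: one castItem per character, each with a unique xml:id. -->
<castList>
  <castItem xml:id="hamlet"><role>Hamlet</role></castItem>
  <castItem xml:id="bernardo"><role>Bernardo</role></castItem>
</castList>

<!-- In the body: who points at the castItem; speaker holds the printed label. -->
<sp who="#bernardo">
  <speaker>Bernardo</speaker>
  <l>Who's there?</l>
</sp>
```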

How do I encode stage directions in TEI drama?

Use stage with a type attribute such as entrance, exit, delivery or setting. Place it inside sp for directions tied to one speech, or between sp elements for action that stands alone.
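
The two placements might look like this. The fragments are illustrative and come from different scenes; whether an aside counts as "delivery" is a vocabulary decision for your project.

```xml
<!-- Inside sp: the direction describes how this one speech is delivered. -->
<sp who="#hamlet">
  <speaker>Hamlet</speaker>
  <stage type="delivery">Aside</stage>
  <l>A little more than kin, and less than kind.</l>
</sp>

<!-- Between speeches: the action belongs to no single speaker. -->
<stage type="entrance">Enter Ghost.</stage>
```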

Can I record metre and rhyme in TEI verse?

Yes. Use met, real and rhyme attributes on l or lg, and declare the notation in metDecl inside the header. This lets you query, for example, every iambic pentameter line or every couplet sharing a rhyme.

How should I handle a verse line that runs across a page break?

Keep the l element whole and insert an empty lb or pb milestone at the break point. Never split one metrical line into two l elements just because the page or column changed.
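
A sketch of a line that continues onto a new page. The page number and the position of the break are hypothetical:

```xml
<!-- One metrical line, one l; the empty pb marks where the new page begins. -->
<l>Whether 'tis nobler in the mind <pb n="42"/>to suffer</l>
```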