The reliable way to link TEI transcriptions to IIIF is to bind each transcription element to a IIIF canvas region — a canvas URI with an #xywh=x,y,w,h fragment — either through TEI's @facs attribute pointing at coordinate zones, or through standoff IIIF annotations that reference the element's xml:id. The text and the pixels then travel together, so a viewer can highlight the exact spot on the page where a word was written.
What are we actually linking?
Three things must agree:
- A TEI element (a <line>, <w>, or <seg>) carrying transcribed text.
- A region of an image, expressed as pixel coordinates.
- A IIIF canvas whose dimensions equal the image's full size in info.json.
The link is the region's coordinates plus the canvas URI. Everything else is bookkeeping about where you store them.
How does TEI express the image side?
TEI's facsimile model lives in <facsimile> with <surface> and <zone> elements. A zone holds the coordinates; the transcription points at it with @facs:
```xml
<facsimile>
  <surface ulx="0" uly="0" lrx="4000" lry="6000"
           xml:id="surf_12">
    <zone xml:id="z_l1" ulx="842" uly="1190" lrx="1052" lry="1236"/>
  </surface>
</facsimile>
<text>
  <body>
    <p><lb facs="#z_l1"/>Anno domini millesimo...</p>
  </body>
</text>
```

Note that TEI zones use corner coordinates (@ulx, @uly, @lrx, @lry), while IIIF wants origin plus size (x,y,w,h).
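To make the @facs lookup concrete, here is a minimal sketch that resolves each @facs pointer to its zone's corner coordinates using Python's standard library. It assumes the fragment above is wrapped in a TEI root element in the TEI namespace, that zones carry integer coordinates, and that pointers are local (#id); the function names zones_by_id and resolve_facs are hypothetical helpers, not part of any TEI toolkit.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Assumption: the article's fragment wrapped in a namespaced <TEI> root.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <facsimile>
    <surface xml:id="surf_12" ulx="0" uly="0" lrx="4000" lry="6000">
      <zone xml:id="z_l1" ulx="842" uly="1190" lrx="1052" lry="1236"/>
    </surface>
  </facsimile>
  <text>
    <body>
      <p><lb facs="#z_l1"/>Anno domini millesimo...</p>
    </body>
  </text>
</TEI>"""


def zones_by_id(root):
    """Map each zone's xml:id to its (ulx, uly, lrx, lry) corners."""
    zones = {}
    for zone in root.iter(f"{{{TEI_NS}}}zone"):
        corners = tuple(int(zone.get(k)) for k in ("ulx", "uly", "lrx", "lry"))
        zones[zone.get(XML_ID)] = corners
    return zones


def resolve_facs(root):
    """Yield (zone id, corners) for every local @facs pointer that hits a zone."""
    zones = zones_by_id(root)
    for el in root.iter():
        # @facs may hold several whitespace-separated pointers.
        for ref in el.get("facs", "").split():
            if ref.startswith("#") and ref[1:] in zones:
                yield ref[1:], zones[ref[1:]]


root = ET.fromstring(SAMPLE)
resolved = list(resolve_facs(root))
print(resolved)
```

Pointers that reference another file, or a IIIF URI directly, are skipped by this sketch and would need their own handling.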
How do I convert TEI zone coordinates to IIIF xywh?
The arithmetic is simple but must use the same pixel origin (top-left) as the canvas:
```python
def zone_to_xywh(ulx, uly, lrx, lry):
    """Convert TEI corner coordinates to a IIIF x,y,w,h string."""
    return f"{ulx},{uly},{lrx - ulx},{lry - uly}"

# z_l1 above -> "842,1190,210,46"
print(zone_to_xywh(842, 1190, 1052, 1236))
```

Then the canvas region URI becomes:

```
https://example.org/iiif/ms-123/canvas/12#xywh=842,1190,210,46
```

If your TEI coordinates were measured on a thumbnail or a differently sized image, rescale first: multiply by canvas_width / measured_width.
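The rescaling step can be sketched as follows. This is an illustration under stated assumptions: rescale_zone is a hypothetical helper, the 1000x1500 derivative size and the zone values are invented for the example, and separate x and y factors are used so the function also tolerates a derivative whose aspect ratio differs slightly from the canvas. When the aspect ratio is preserved, both factors equal canvas_width / measured_width.

```python
def rescale_zone(corners, measured_size, canvas_size):
    """Scale (ulx, uly, lrx, lry) from the measured image to canvas pixels."""
    ulx, uly, lrx, lry = corners
    sx = canvas_size[0] / measured_size[0]  # horizontal scale factor
    sy = canvas_size[1] / measured_size[1]  # vertical scale factor
    return (round(ulx * sx), round(uly * sy), round(lrx * sx), round(lry * sy))


def zone_to_xywh(ulx, uly, lrx, lry):
    """Convert corner coordinates to a IIIF x,y,w,h string."""
    return f"{ulx},{uly},{lrx - ulx},{lry - uly}"


# Hypothetical zone drawn on a 1000x1500 derivative of a 4000x6000 canvas.
scaled = rescale_zone((210, 297, 263, 309), (1000, 1500), (4000, 6000))
print(zone_to_xywh(*scaled))  # -> "840,1188,212,48"
```

Scaling the corners first and deriving width and height afterwards keeps the rounded region edges consistent with each other.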
Should the link sit in the TEI or in standoff annotations?
| Approach | Where links live | Best when |
|---|---|---|
| Inline @facs | Inside the TEI document | Edition and coordinates are authored together |
| Standoff IIIF annotations | External AnnotationPage | Coordinates regenerated by OCR/HTR, edition kept clean |
For most digital editions I recommend standoff: keep the TEI free of coordinate noise, give every linkable element a stable xml:id, and emit a separate IIIF AnnotationPage that references those ids. You can then re-run layout analysis and rebuild coordinates without touching the scholarly text.
How do I produce annotations a viewer can highlight?
Generate an AnnotationPage whose annotations target canvas regions and carry the transcribed text as the body:
```json
{
  "@context": "http://iiif.io/api/presentation/3/context.json",
  "type": "AnnotationPage",
  "id": "https://example.org/iiif/ms-123/annos/page12",
  "items": [
    {
      "type": "Annotation",
      "motivation": "supplementing",
      "body": {
        "type": "TextualBody",
        "value": "Anno domini millesimo",
        "format": "text/plain",
        "language": "la"
      },
      "target": "https://example.org/iiif/ms-123/canvas/12#xywh=842,1190,210,46"
    }
  ]
}
```

Reference that page from the canvas's annotations array, and Mirador or the Universal Viewer will show the text and highlight the region on hover or click.
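Generating such a page by hand does not scale past a few lines, so here is a minimal sketch of a builder. The function name build_annotation_page is hypothetical, and so is the choice to give each annotation an id made of the page URI plus the TEI xml:id; that is one possible way to keep a path back to the TEI element, not a convention prescribed by IIIF or TEI.

```python
import json

IIIF_CONTEXT = "http://iiif.io/api/presentation/3/context.json"


def build_annotation_page(page_id, canvas_id, lines, language="la"):
    """Build an AnnotationPage dict.

    lines: iterable of (xml_id, (ulx, uly, lrx, lry), text) tuples,
    with coordinates already in canvas pixels.
    """
    items = []
    for xml_id, (ulx, uly, lrx, lry), text in lines:
        items.append({
            # Assumption: annotation id = page URI + TEI xml:id.
            "id": f"{page_id}/{xml_id}",
            "type": "Annotation",
            "motivation": "supplementing",
            "body": {
                "type": "TextualBody",
                "value": text,
                "format": "text/plain",
                "language": language,
            },
            "target": f"{canvas_id}#xywh={ulx},{uly},{lrx - ulx},{lry - uly}",
        })
    return {
        "@context": IIIF_CONTEXT,
        "id": page_id,
        "type": "AnnotationPage",
        "items": items,
    }


page = build_annotation_page(
    "https://example.org/iiif/ms-123/annos/page12",
    "https://example.org/iiif/ms-123/canvas/12",
    [("z_l1", (842, 1190, 1052, 1236), "Anno domini millesimo")],
)
print(json.dumps(page, indent=2))
```

Because the builder takes plain tuples, the same function can be fed from manual zones or from HTR output once both are expressed in canvas pixels.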
What is a sensible end-to-end workflow?
- Encode the transcription in TEI; give every linkable element an xml:id.
- Obtain coordinates, from HTR (ALTO/hOCR) or manual zoning, at full canvas resolution.
- Convert corner coordinates to IIIF xywh.
- Emit a standoff AnnotationPage linking xml:id → canvas region → text body.
- Reference the AnnotationPage from the manifest's canvases.
- Verify in a viewer: click a line, see it highlighted; click the image, see the text.
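Before opening a viewer, a quick programmatic check can catch regions that cannot possibly be right, such as zones measured on a differently sized image. The sketch below is an assumption-laden illustration: parse_xywh and region_problems are hypothetical helpers, targets are assumed to be plain strings with integer #xywh values, and the 4000x6000 canvas size is taken from the surface in the TEI example.

```python
def parse_xywh(target):
    """Split 'canvas#xywh=x,y,w,h' into (canvas URI, (x, y, w, h))."""
    canvas, _, fragment = target.partition("#xywh=")
    x, y, w, h = (int(v) for v in fragment.split(","))
    return canvas, (x, y, w, h)


def region_problems(target, canvas_width, canvas_height):
    """Return a list of human-readable problems; empty means the region fits."""
    _, (x, y, w, h) = parse_xywh(target)
    problems = []
    if w <= 0 or h <= 0:
        problems.append("region has no area")
    if x < 0 or y < 0:
        problems.append("region starts outside the canvas")
    if x + w > canvas_width or y + h > canvas_height:
        problems.append("region extends past the canvas edge")
    return problems


good = "https://example.org/iiif/ms-123/canvas/12#xywh=842,1190,210,46"
bad = "https://example.org/iiif/ms-123/canvas/12#xywh=3900,1190,210,46"
print(region_problems(good, 4000, 6000))  # -> []
print(region_problems(bad, 4000, 6000))   # -> ['region extends past the canvas edge']
```

A target without an #xywh fragment raises ValueError here, which is deliberate: in this workflow every line-level annotation is expected to carry a region. The check does not replace the visual verification step, since a region can sit inside the canvas and still be on the wrong line.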
Key Takeaways
- The link is a canvas URI plus an #xywh fragment tied to a TEI element.
- TEI zones use corner coordinates; IIIF uses origin-plus-size. Convert with w = lrx - ulx, h = lry - uly.
- Standoff IIIF annotations keep the scholarly TEI clean and regenerable.
- Canvas pixel coordinates must match the full image size in info.json; rescale otherwise.
- A supplementing annotation with a TextualBody lets viewers display and highlight transcribed text.
- Give every linkable TEI element a stable xml:id so links survive re-editing.
Frequently Asked Questions
What is the cleanest way to link a TEI line to a IIIF canvas region?
Record the IIIF canvas URI plus a media fragment (#xywh=x,y,w,h) and associate it with the TEI element, either via the facs attribute pointing at a zone or via a standoff annotation referencing the element's xml:id.
Should the link live in the TEI file or in IIIF annotations?
Either works; standoff is cleaner. Keeping links as external IIIF annotations referencing TEI xml:ids avoids cluttering the transcription and lets you regenerate coordinates without re-editing the edition.
What does the facs attribute do in TEI?
facs points from a transcription element to the part of a facsimile it transcribes, usually a zone with coordinates inside a surface, or directly to a IIIF canvas region URI. It binds text to image.
How do coordinates from TEI zones map to IIIF xywh?
TEI zone @ulx,@uly,@lrx,@lry give upper-left and lower-right corners; IIIF xywh wants x,y,width,height. Convert with w = lrx - ulx and h = lry - uly, keeping the same pixel origin as the IIIF canvas.
Can I drive a Mirador highlight from a TEI transcription?
Yes. Emit a IIIF AnnotationPage whose annotations target canvas regions and carry the transcribed text as the body. Mirador and similar viewers then show the text and highlight the matching region.
Do canvas pixel coordinates match my original scan resolution?
They match the canvas dimensions declared in the manifest, which should equal the full image width and height in info.json. If your TEI zones were measured on a different-sized image you must rescale before mapping.