Transkribus Workflows

Transkribus exposes a REST API that lets you script the full pipeline — upload, layout analysis, recognition, polling, and export — so a one-off click-through in the desktop app becomes a repeatable batch job. The standard pattern is: authenticate once to get a token, create or select a collection, upload each document's images, fire asynchronous layout and recognition jobs, poll each job by its ID until it finishes, then export PAGE XML or TEI. Everything below is the skeleton of that loop, with the gotchas that bite first-time automators.

What can the API actually do?

The API mirrors the app's pipeline as discrete, scriptable steps:

  • Auth — exchange credentials for a session token.
  • Collections / documents — create, list, and manage containers.
  • Upload — push page images into a document.
  • Layout analysis — detect regions and baselines (a job).
  • Text recognition — run a model over the pages (a job).
  • Export — download PAGE XML, ALTO, TEI, plain text, or PDF.

Layout and recognition are asynchronous: you start them, get a job ID back, and poll. Treating them as synchronous is the single most common beginner bug.
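The start-then-poll pattern can be sketched independently of any particular endpoint. This is a minimal sketch, not Transkribus client code: `poll_until_done` and all of its parameters are hypothetical names, and `get_state` stands for whatever call fetches the job's current state string.

```python
import time
from typing import Callable, Iterable


def poll_until_done(
    get_state: Callable[[], str],
    terminal: Iterable[str] = ("FINISHED", "FAILED"),
    first_delay: float = 10.0,
    max_delay: float = 30.0,
    timeout: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call get_state() until it returns a terminal state, then return it.

    The delay between polls doubles after each attempt, capped at max_delay.
    Raises TimeoutError once the total time slept reaches `timeout`.
    """
    terminal_states = set(terminal)
    delay = first_delay
    waited = 0.0
    while True:
        state = get_state()
        if state in terminal_states:
            return state
        if waited >= timeout:
            raise TimeoutError(f"job still {state!r} after {waited:.0f}s")
        sleep(delay)
        waited += delay
        delay = min(delay * 2, max_delay)
```

Passing `sleep` as a parameter keeps the function testable without real waiting. The timeout is an addition of this sketch, so a job that never reaches a terminal state cannot block a batch forever.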

How do I authenticate without leaking credentials?

Log in once, capture the token, and reuse it. Keep secrets out of the code with environment variables.

bash
export TRANSKRIBUS_USER="you@example.com"   # placeholder address
export TRANSKRIBUS_PASS="••••••••"

python
import os, requests

BASE = "https://transkribus.eu/TrpServer/rest"  # base path
s = requests.Session()

r = s.post(f"{BASE}/auth/login", data={
    "user": os.environ["TRANSKRIBUS_USER"],
    "pw":   os.environ["TRANSKRIBUS_PASS"],
})
r.raise_for_status()
# session cookie / token now lives in `s` for subsequent calls

Never hard-code the password, and never commit a token to version control.
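A small guard can make a missing variable fail loudly before any request is sent. This is a minimal sketch: `load_credentials` is a hypothetical helper, and it assumes the two variable names exported above.

```python
import os


def load_credentials(env=os.environ):
    """Return (user, password) from the environment or raise a clear error."""
    required = ("TRANSKRIBUS_USER", "TRANSKRIBUS_PASS")
    # Treat unset and empty variables the same way.
    missing = [name for name in required if not env.get(name)]
    if missing:
        # Report only the variable names, never their values.
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return env["TRANSKRIBUS_USER"], env["TRANSKRIBUS_PASS"]
```

The returned pair can be passed to the login call in place of the direct `os.environ[...]` lookups, which would otherwise fail with a bare `KeyError`.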

How do I run recognition over a whole batch?

The repeatable core is a loop that starts recognition for one document, polls until the job finishes, and only then moves on, with a pause between polls so you do not hammer the server.

python
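import time

def recognise(col_id, doc_id, model_id):
    # Start the asynchronous recognition job and read its ID from the response.
    r = s.post(f"{BASE}/recognition/{col_id}/{doc_id}/{model_id}")
    r.raise_for_status()
    job_id = r.json()["jobId"]
    while True:
        # Poll the job-status endpoint until the job reaches a terminal state.
        status = s.get(f"{BASE}/jobs/{job_id}")
        status.raise_for_status()
        state = status.json()["state"]
        if state in ("FINISHED", "FAILED"):
            return state
        time.sleep(15)            # fixed pause; do not poll in a tight loop

for doc_id in document_ids:
    state = recognise(COL, doc_id, MODEL)
    print(doc_id, state)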

Add a try/except, log every job ID, and write a small manifest of which documents succeeded so a crash mid-batch is resumable.
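One way to make the batch resumable is an append-only manifest with one JSON line per processed document. This is a sketch under assumptions: `load_done`, `run_batch`, and the manifest layout are hypothetical, `process` is any caller-supplied function that takes a document ID and returns a state string, and logging the job ID is left to `process`.

```python
import json
import logging
from pathlib import Path

log = logging.getLogger("batch")


def load_done(manifest_path):
    """Return the set of document IDs already recorded as FINISHED."""
    path = Path(manifest_path)
    if not path.exists():
        return set()
    done = set()
    with path.open(encoding="utf-8") as fh:
        for line in fh:
            if line.strip():
                entry = json.loads(line)
                if entry["state"] == "FINISHED":
                    done.add(entry["doc_id"])
    return done


def run_batch(doc_ids, process, manifest_path):
    """Process each unfinished document, appending one JSON line per outcome."""
    done = load_done(manifest_path)
    with Path(manifest_path).open("a", encoding="utf-8") as fh:
        for doc_id in doc_ids:
            if doc_id in done:
                log.info("skip %s (already finished)", doc_id)
                continue
            try:
                state = process(doc_id)
            except Exception as exc:  # record the failure and keep going
                log.exception("document %s raised", doc_id)
                state = f"ERROR: {exc}"
            fh.write(json.dumps({"doc_id": doc_id, "state": state}) + "\n")
            fh.flush()  # persist progress before starting the next document
```

Re-running `run_batch` with the same manifest skips documents recorded as FINISHED and retries everything else, so a rerun avoids recognising the same pages twice.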

What should I watch out for?

Pitfall                      | Consequence                    | Mitigation
-----------------------------|--------------------------------|------------------------------------
Treating jobs as synchronous | Empty exports, race conditions | Always poll the job-status endpoint
Tight polling loop           | Rate-limit / throttling        | Sleep 10-30 s with back-off
No idempotency record        | Re-running re-bills pages      | Track done docs in a manifest
Hard-coded credentials       | Leaked secrets                 | Use env vars / a secrets store
Ignoring HTTP errors         | Silent partial batches         | raise_for_status() everywhere

How do I export the results?

Once recognition finishes, request the format you need. PAGE XML is the richest because it keeps baselines, regions, reading order, and tags — ideal if the next stage is TEI conversion or a database.

python
exp = s.post(f"{BASE}/collections/{COL}/{doc_id}/export",
             json={"format": "PAGE"})
exp.raise_for_status()   # fail loudly if the export request is rejected
# then poll the export job and download the resulting zip

For human-readable proofs, request PDF; for an edition pipeline, request TEI and feed it straight into your encoding workflow.
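Once an exported PAGE XML file is on disk, the line text can be pulled out with the standard library. This is a minimal sketch assuming the usual PAGE layout, where `TextLine` elements carry a `TextEquiv/Unicode` child. `page_lines` is a hypothetical helper, and it reads the namespace from the root element rather than hard-coding a schema version.

```python
import xml.etree.ElementTree as ET


def page_lines(xml_text):
    """Yield (line_id, text) for every TextLine in a PAGE XML document."""
    root = ET.fromstring(xml_text)
    # Reuse the root element's namespace so any PAGE schema version works.
    ns = root.tag[1:root.tag.index("}")] if root.tag.startswith("{") else ""

    def q(name):
        """Qualify a tag name with the document's namespace, if it has one."""
        return f"{{{ns}}}{name}" if ns else name

    for line in root.iter(q("TextLine")):
        # Only the line's own TextEquiv is read, not those of nested words.
        unicode_el = line.find(f"{q('TextEquiv')}/{q('Unicode')}")
        text = unicode_el.text if unicode_el is not None and unicode_el.text else ""
        yield line.get("id"), text
```

Lines come back in document order, which may differ from the reading order recorded in the file, so a real pipeline would likely need to consult that as well.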

Key Takeaways

  • The Transkribus REST API scripts the entire pipeline: auth, upload, layout, recognition, export.
  • Layout and recognition are asynchronous — start, get a job ID, poll until FINISHED/FAILED.
  • Keep credentials in environment variables and never commit tokens.
  • Throttle polling with a back-off and record completed documents so batches are resumable.
  • API recognition costs the same credits as the app — estimate before a large run.
  • Export PAGE XML for downstream encoding; PDF for proofs; TEI for editions.

Frequently Asked Questions

Does Transkribus have a public API?

Yes. Transkribus exposes a REST API that covers authentication, collection and document management, layout analysis, text recognition, and export, so you can script the entire pipeline without the desktop app.

How do I authenticate to the Transkribus API?

You log in to obtain a session token or use the OAuth flow, then send that token in the Authorization header of every subsequent request. Store credentials in environment variables, never in the script itself.

Can I run recognition on hundreds of documents automatically?

Yes. The typical loop is: create or select a collection, upload images per document, trigger a layout-analysis job, trigger a recognition job with your chosen model, poll until the job finishes, then export. Wrap it in a loop with rate limiting.

How do I know when an API job is done?

Recognition and layout jobs are asynchronous and return a job ID. Poll the job-status endpoint until the state reads finished or failed, with a back-off delay, rather than assuming completion.

What export formats can I pull via the API?

You can export PAGE XML, ALTO, TEI, plain text, and PDF among others. PAGE XML is the richest for preserving baselines, regions, and tags for downstream processing.

Does the API cost the same credits as the desktop app?

Yes, recognition billed through the API consumes the same per-page credits as a run in the desktop or web app. Automation saves human time, not credits, so estimate cost before launching a large batch.
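A rough pre-flight estimate is simple arithmetic: total pages times the per-page rate. The sketch below is illustrative only: `estimate_credits` is a hypothetical helper, and `credits_per_page` is an input you must take from your own plan and model, not a value stated here.

```python
def estimate_credits(page_counts, credits_per_page):
    """Return (total_pages, total_credits) for a planned batch.

    page_counts maps a document ID to its number of pages;
    credits_per_page is the rate that applies to your plan and model.
    """
    total_pages = sum(page_counts.values())
    return total_pages, total_pages * credits_per_page
```

Comparing the result with your remaining balance before launching the loop avoids a batch that stops halfway for lack of credits.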